
Machine learning and multi-level GLM models as alternatives to logistic regression in propensity score estimation for the study of medical devices using observational data

This project is no longer listed on FindAPhD and may not be available.

  • Full or part time
    Prof D Prieto-Alhambra
    Dr V Strauss
    Dr S Khalid
  • Application Deadline
    No more applications being accepted
  • Funded PhD Project (European/UK Students Only)

About This PhD Project

Project Description

Observational studies of medical devices may provide evidence on comparative safety in real-life patients. Regulatory bodies including the FDA have shown interest in the use of evidence from studies using routinely collected data. However, the critical challenge in these studies is confounding.

Methods from statistics, econometrics and artificial intelligence have helped to minimize measured confounding in drug safety and comparative effectiveness research. However, these methods do not account for key confounders in medical device epidemiology that are unrelated to patient features, such as surgeon or hospital characteristics or volume. Machine learning and multi-level generalized linear model (GLM) methods have the potential to address these challenges, yet their performance warrants further research.

This PhD programme will investigate the performance of different machine learning algorithms and multi-level GLM methods for the study of the risks associated with medical devices as used in actual practice conditions. To do so, we will use routinely collected big health data (also known as “real world data”) as well as simulated datasets.

Propensity scores for a set of given treatments will be estimated using different machine learning algorithms (random forests, boosting, neural networks, support vector machines, and Bayesian additive regression trees) and compared with those from multi-level GLMs and from the most commonly used multivariable logistic regression models.
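As a rough illustration of the baseline approach named above, the following sketch estimates propensity scores with a plain logistic regression fitted by gradient descent. It is not the project's method: the data are simulated, and the covariates (a standardized age and a hospital-volume indicator), the coefficients, and the function names `simulate_cohort`, `fit_logistic` and `propensity_scores` are all assumptions made for the example. The hospital-volume indicator stands in for the non-patient confounders mentioned in the description; a multi-level GLM or a machine learning algorithm would replace the fitting step.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def simulate_cohort(n, seed=0):
    """Simulate covariates and a binary device assignment.

    Hypothetical data-generating process: older patients and patients
    operated on in high-volume hospitals are more likely to get device A.
    """
    rng = random.Random(seed)
    rows, treated = [], []
    for _ in range(n):
        age = rng.gauss(0.0, 1.0)                          # standardized age
        high_volume = 1.0 if rng.random() < 0.4 else 0.0   # hospital-level covariate
        p = sigmoid(-0.5 + 0.8 * age + 1.0 * high_volume)  # assumed true model
        rows.append([age, high_volume])
        treated.append(1 if rng.random() < p else 0)
    return rows, treated


def fit_logistic(rows, treated, lr=1.0, n_iter=500):
    """Fit logistic regression by batch gradient descent on the mean log-loss."""
    n, k = len(rows), len(rows[0])
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for x, t in zip(rows, treated):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) - t
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * x[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity_scores(w, rows):
    """Predicted probability of receiving the device for each patient."""
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) for x in rows]


rows, treated = simulate_cohort(400, seed=42)
weights = fit_logistic(rows, treated)
scores = propensity_scores(weights, rows)
print("coefficients (intercept, age, high_volume):", [round(v, 2) for v in weights])
print("mean propensity score:", round(sum(scores) / len(scores), 3))
```

Because the model includes an intercept, the mean fitted score should match the observed proportion of treated patients once the fit has converged, which gives a quick sanity check on the estimation step.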

These different analyses will be applied to clinical use cases and compared to ongoing randomized controlled trials where available. In addition, they will also be used for the analysis of simulated datasets for the study of their ability to minimize confounding and related bias in comparative device safety research.
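The role of simulated datasets can be sketched as follows: when the data-generating process is known, the bias of each estimator can be measured directly. This minimal example assumes a single measured confounder (called `frailty` here), a null device effect, and the true propensity score rather than an estimated one; all names and coefficients are hypothetical. It compares a naive difference in means with a normalized inverse-probability-weighted difference.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, true_effect=0.0, seed=0):
    """Return (treated, outcome, true propensity) triples for n patients."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        frailty = rng.gauss(0.0, 1.0)        # measured confounder (assumed)
        ps = sigmoid(1.2 * frailty)          # frail patients get the device more often
        treated = 1 if rng.random() < ps else 0
        # Frailty also worsens the outcome, so it confounds the comparison.
        outcome = true_effect * treated + 1.5 * frailty + rng.gauss(0.0, 1.0)
        data.append((treated, outcome, ps))
    return data


def naive_difference(data):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    t = [y for a, y, _ in data if a == 1]
    c = [y for a, y, _ in data if a == 0]
    return sum(t) / len(t) - sum(c) / len(c)


def ipw_difference(data):
    """Normalized inverse-probability-weighted difference in means."""
    num_t = sum(y / ps for a, y, ps in data if a == 1)
    den_t = sum(1.0 / ps for a, y, ps in data if a == 1)
    num_c = sum(y / (1.0 - ps) for a, y, ps in data if a == 0)
    den_c = sum(1.0 / (1.0 - ps) for a, y, ps in data if a == 0)
    return num_t / den_t - num_c / den_c


data = simulate(20000, true_effect=0.0, seed=1)
naive = naive_difference(data)
ipw = ipw_difference(data)
print("true effect: 0.0")
print("naive estimate:", round(naive, 3))
print("IPW estimate:  ", round(ipw, 3))
```

With a true effect of zero, any departure of an estimate from zero is bias or sampling noise. In a real comparison the propensity score would be estimated by each candidate method, and the residual bias of the weighted estimate would be one way to rank them.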

Essential and Desired Qualifications/Experience

• Previous graduate training in epidemiology, statistics or data science

• MSc or any post-graduate training in biostatistics, epidemiology or data sciences
• Experience in data analyses

Details of the Research Group

The DPhil will be jointly supervised by Prof Prieto-Alhambra (Professor of Pharmaco- and Device Epidemiology and Theme Lead for Observational Research), Dr Victoria Strauss, and Dr Sara Khalid, all members of the Centre for Statistics in Medicine, NDORMS, University of Oxford. The research will be conducted with the Pharmaco- and Device Epidemiology Research Group, at the premises of the Botnar Research Centre in Oxford, UK.

The allocated college is Wolfson.

Prof Daniel Prieto-Alhambra has published extensively in the field of pharmaco-epidemiology, and is recognised internationally as an authority on the use of routine data for pharmaco- and device epidemiology and related methods.

Dr Victoria Strauss is a Senior Statistician. She has extensive expertise in the use, validation and development of statistical methods, both for the analysis of routinely collected data and in randomized clinical trial settings.

Dr Sara Khalid leads machine learning and big data analytics in Prof Prieto-Alhambra’s group. She has an Oxford DPhil in Engineering Science and an excellent track record in the use of big data methods, including machine learning algorithms.

Current DPhil Students within the pharmaco-epidemiology research group: 5


Training will be provided in relevant research methodology, including the handling and analysis of large datasets and advanced statistical techniques. Attendance at formal training courses will be encouraged, and will include the "Real world epidemiology Oxford summer school" directed by Prof Prieto-Alhambra, and the pre-conference courses offered by the International Society for Pharmacoepidemiology, amongst others.

A core curriculum of lectures organized departmentally will be taken in the first term to provide a solid foundation in a broad range of subjects including epidemiology, health economics, and data analysis.

How to apply

We are delighted to offer the successful candidate a 3-year studentship including EU/home tuition and college fees and a stipend of £20,000 (tax free) per annum for 3 years.

The department accepts applications throughout the year but it is recommended that, in the first instance, you contact the relevant supervisors or the Graduate Studies Officer, Sam Burnell ([Email Address Removed]), who will be able to advise you of the essential requirements.

Interested applicants should have or expect to obtain a first or upper second class BSc degree or equivalent, and will also need to provide evidence of English language competence. The application guide and form are found online. The DPhil will commence in October 2019.

For further information, please contact Prof Prieto-Alhambra ([Email Address Removed]).


- Sample size and power considerations for ordinary least squares interrupted time series analysis: a simulation study. Hawley S, et al. Clin Epidemiol. 2019 Feb 25;11:197-205.

- The impact of different strategies to handle missing data on both precision and bias in a drug safety study: a multidatabase multinational population-based cohort study. Martín-Merino E, et al. Clin Epidemiol. 2018 Jun 5;10:643-654.

- Methodological comparison of marginal structural model, time-varying Cox regression, and propensity score methods: the example of antidepressant use and the risk of hip fracture. Ali MS, et al. Pharmacoepidemiol Drug Saf. 2016 Mar;25 Suppl 1:114-21.
