Data-Science Approaches to Better Understand Multimorbidity and Treatment Outcomes in Patients with Rheumatoid Arthritis

This project is no longer listed on FindAPhD and may not be available.

  • Full or part time
    Prof K Hyrich
    Dr G Nenadic
    Dr N Geifman
    Dr Stephanie Shoop-Worrall
  • Application Deadline
    No more applications being accepted
  • Competition Funded PhD Project (European/UK Students Only)

Project Description

Chronic inflammatory diseases such as rheumatoid arthritis (RA) are potentially life-ruining. Not only is the condition itself associated with significant pain and disability, but patients are also more likely to be diagnosed with other comorbid conditions. This multimorbidity can both be caused by, and influence, disease status (such as remission) and medications. How these comorbidities develop and cluster over time, and how they are associated with medications, is not well understood. Data-science approaches offer a new opportunity to explore this further, with the potential to reveal previously unrecognised patterns of illness over time.

The British Society for Rheumatology has been capturing significant clinical data from >30,000 patients with RA since 2001. Free-text adverse event and comorbidity data (>140,000 records) have been recorded and manually coded to the Medical Dictionary for Regulatory Activities (MedDRA); however, manual coding is laborious and introduces the potential for inconsistencies in disease identification over time. Natural language processing (NLP), using text-mining software to automatically "machine-code" adverse event and comorbidity data, offers a new opportunity to better harmonise outcome data over time. Unsupervised machine learning approaches, such as Latent Class Analysis and Topological Data Analysis, can subsequently be applied to these data to look for patterns of disease clustering and their relationship to medication and the underlying arthritis over time.
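As a rough illustration of the latent class idea, the sketch below fits a mixture of independent Bernoulli distributions to binary comorbidity indicators using expectation-maximisation. It is not the project's method or software. The function names (`fit_latent_classes`, `assign_classes`), the condition names and the toy records are all invented for illustration, and a real analysis would more likely rely on an established statistical package and on data with far more noise.

```python
import math
import random


def fit_latent_classes(records, n_classes, n_iter=200, seed=0):
    """Fit a latent class model (mixture of independent Bernoullis) by EM.

    records: list of equal-length 0/1 lists, one row per patient.
    Returns (class_weights, item_probs, responsibilities).
    """
    rng = random.Random(seed)
    n_items = len(records[0])
    weights = [1.0 / n_classes] * n_classes
    # Random starting probabilities, kept away from 0 and 1.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each patient.
        resp = []
        for row in records:
            log_post = []
            for k in range(n_classes):
                lp = math.log(weights[k])
                for x, p in zip(row, probs[k]):
                    lp += math.log(p if x else 1.0 - p)
                log_post.append(lp)
            top = max(log_post)
            unnorm = [math.exp(lp - top) for lp in log_post]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update class weights and per-class item probabilities.
        for k in range(n_classes):
            mass = sum(r[k] for r in resp)
            weights[k] = max(mass / len(records), 1e-9)
            for j in range(n_items):
                hits = sum(r[k] * row[j] for r, row in zip(resp, records))
                # Light smoothing keeps probabilities strictly inside (0, 1).
                probs[k][j] = (hits + 0.01) / (mass + 0.02)
    return weights, probs, resp


def assign_classes(resp):
    """Assign each patient to the class with the highest posterior."""
    return [max(range(len(r)), key=lambda k: r[k]) for r in resp]


# Hypothetical indicator columns; entirely synthetic, noise-free toy data.
conditions = ["hypertension", "diabetes", "depression", "osteoporosis"]
records = [[1, 1, 0, 0]] * 6 + [[0, 0, 1, 1]] * 6

weights, probs, resp = fit_latent_classes(records, n_classes=2)
labels = assign_classes(resp)
print(labels)
```

Class numbering is arbitrary, so only the grouping of patients is meaningful, not which class is labelled 0 or 1. In practice the number of classes would be chosen by comparing fit criteria across several candidate models.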

Proposed 3-year PhD Plan:

Year 1: Literature reviews on (1) multimorbidity in RA and (2) the utility of NLP/ML for discovery of "disease status". Tailoring and application of text-mining software to the study's free-text data.

Year 2: Clustering based on comorbidity patterns: application of latent class analysis (and other methods) to a cross-sectional snapshot of all accumulated events, relating the resulting clusters back to outcomes (drug exposure, remission status, etc.).

Year 3: Longitudinal analysis of morbidity patterns, identifying disease/adverse event sequences using ML approaches; write up and submit PhD.

Entry Requirements
Applicants are expected to hold, or be about to obtain, a minimum upper second class undergraduate degree (or equivalent) in epidemiology, statistics, data science, computing or another related field. A Master's degree in a relevant subject and/or experience in a related subject area or discipline is desirable. This PhD would be attractive to candidates with experience and training in data science, and also to those from an epidemiological or statistical background looking to increase their knowledge and experience in data science.

For information on how to apply for this project, please visit the Faculty of Biology, Medicine and Health Doctoral Academy website. Informal enquiries may be made directly to the primary supervisor. You MUST also submit an online application form - choose PhD Epidemiology.

Funding Notes

This project is funded by the Centre for Epidemiology Versus Arthritis. Studentship funding is for a duration of three years, to commence in September 2020, and covers UK/EU tuition fees and a UKRI stipend of £15,285 per annum. Due to funding restrictions, the studentship is open to UK and EU nationals with 3 years' residency in the UK.

As an equal opportunities institution we welcome applicants from all sections of the community regardless of gender, ethnicity, disability, sexual orientation and transgender status. All appointments are made on merit.



FindAPhD. Copyright 2005-2020
All rights reserved.