Agents of DataSHIELD: Developing DataSHIELD infrastructures

This project is no longer listed in the FindAPhD
database and may not be available.

  • Full or part time
  • Supervisors
    Prof P Burton
    Dr B Wilson
  • Application Deadline
    Applications accepted all year round
  • Self-Funded PhD Students Only

About This PhD Project

Project Description

Rationale
Many of the exciting research questions studied in modern biomedical and social science demand sample sizes (numbers of analytic observations) so large that they can only be achieved by pooling data from multiple sources. But pooling potentially sensitive health-related data into a central database for subsequent querying can be highly problematic. In particular, practical and ethico-legal issues, together with concerns about the control of intellectual property, may prohibit or discourage the physical pooling of the data. Furthermore, societal concerns about participant/patient privacy and confidentiality must be properly taken into account.

We lead an international scientific software development collaboration that has created a series of open source R packages (DataSHIELD) that use a novel analytic approach. Data never leave their original location, but are subject to remote parallelised analysis. Results are identical to a conventional analysis based directly on the pooled data.
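
The claim that a remote analysis can match a pooled one can be illustrated with a minimal sketch. It is written in Python purely for illustration and does not use the DataSHIELD R packages or their API; the site data, the function names and the choice of simple linear regression are all assumptions. Each site returns only aggregate sums, and the coordinating analyst combines them to fit the model.

```python
import math

def site_summary(xs, ys):
    """Run at a data site: return aggregate sums only, never individual rows."""
    return {
        "n": len(xs),
        "sx": sum(xs),
        "sy": sum(ys),
        "sxx": sum(x * x for x in xs),
        "sxy": sum(x * y for x, y in zip(xs, ys)),
    }

def combine(summaries):
    """Run by the analyst: add up the per-site aggregates."""
    keys = ("n", "sx", "sy", "sxx", "sxy")
    return {k: sum(s[k] for s in summaries) for k in keys}

def fit_line(t):
    """Ordinary least squares for y = intercept + slope * x from the sums."""
    slope = (t["n"] * t["sxy"] - t["sx"] * t["sy"]) / (t["n"] * t["sxx"] - t["sx"] ** 2)
    intercept = (t["sy"] - slope * t["sx"]) / t["n"]
    return slope, intercept

# Hypothetical data held at three separate sites (invented for this sketch).
SITES = [
    ([1.0, 2.0, 3.0], [2.1, 3.9, 6.2]),
    ([4.0, 5.0], [8.1, 9.8]),
    ([6.0, 7.0, 8.0, 9.0], [12.2, 13.9, 16.1, 18.0]),
]

# Federated route: only the summaries leave each site.
federated = fit_line(combine([site_summary(xs, ys) for xs, ys in SITES]))

# Conventional route: physically pool every row, then fit.
pooled_x = [x for xs, _ in SITES for x in xs]
pooled_y = [y for _, ys in SITES for y in ys]
pooled = fit_line(site_summary(pooled_x, pooled_y))

# The two routes agree up to floating-point rounding.
print(all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(federated, pooled)))
```

In exact arithmetic the two fits coincide, because the pooled sums are simply the per-site sums added together; in floating point they can differ only by rounding. Models without such simple sufficient statistics need iterative exchanges of summaries, which this sketch does not attempt.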

Aims & Objectives
Using open source tools, this project will contribute to the development of the core DataSHIELD infrastructures. This may include: developing statistical functions or R packages to broaden the range of statistical techniques currently available; exploring applications to new data types; and validating existing DataSHIELD statistical functions to highlight indicators of potential disclosure. There is also an opportunity to design a user interface for DataSHIELD.
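
One simple indicator of potential disclosure is a frequency table containing very small cells, which could point to identifiable individuals. The sketch below shows the general shape of such a check in Python. It is an assumption-laden illustration, not DataSHIELD's own implementation: the threshold value and the function name `safe_table` are hypothetical.

```python
from collections import Counter

MIN_CELL_COUNT = 5  # hypothetical threshold, chosen only for illustration

def safe_table(values, min_count=MIN_CELL_COUNT):
    """Tabulate values, refusing to release the table if any cell is too small."""
    counts = Counter(values)
    # Cells with fewer than min_count observations are flagged as risky.
    risky = sorted(k for k, c in counts.items() if c < min_count)
    if risky:
        # For a validation report we record which cells triggered the block.
        return {"released": False, "table": None, "risky_cells": risky}
    return {"released": True, "table": dict(counts), "risky_cells": []}

# Both cells are large enough, so the table is released.
result_ok = safe_table(["a"] * 6 + ["b"] * 7)

# Cell "b" has only two observations, so the table is withheld.
result_blocked = safe_table(["a"] * 6 + ["b"] * 2)

print(result_ok["released"], result_blocked["released"])
```

A validation exercise of the kind described above could run checks like this against the output of each statistical function and record which inputs trigger a block.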

Methods
i) Adopting a transdisciplinary approach to the creation, optimisation and implementation of open source software, based on prior experience and best practice in other settings and disciplines.

ii) Developing and using a variety of skills across programming and scripting languages (e.g. Python, Perl, PHP) and scientific data analysis languages (particularly R), with application to Linux system administration, development operations and web operations within DataSHIELD.

iii) Working carefully and systematically around the complete life cycle of software development: systems analysis/audit, design, development, testing, implementation and maintenance.

iv) Placing appropriate emphasis on comprehensive and comprehensible documentation of all work and developments.

