
Agents of DataSHIELD: White hat hacking the DataSHIELD infrastructure


About This PhD Project

Project Description

Rationale
Many of the exciting research questions studied in modern biomedical and social science demand sample sizes (numbers of analytic observations) so large that they can only be achieved by pooling data from multiple sources. But pooling potentially sensitive health-related data into a central database for subsequent querying can be highly problematic. In particular, practical and ethico-legal issues, together with concerns about the control of intellectual property, may prohibit or discourage the physical pooling of the data. Furthermore, societal concerns about participant/patient privacy and confidentiality must be properly taken into account.

We lead an international scientific software development collaboration that has created a series of open source R packages (DataSHIELD) built on a novel analytic approach: data never leave their original location but are analysed remotely and in parallel. The results are identical to those of a conventional analysis based directly on the pooled data.
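
As a rough illustration of why an analysis can match the pooled result without any data leaving its site, the sketch below computes a mean from per-site summaries only. It is written in Python for brevity and is not the DataSHIELD API: the site names, the values and the functions `local_summary` and `combined_mean` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical per-site measurements. In a real deployment each list would
# stay on its own server and never be transmitted.
SITE_DATA: Dict[str, List[float]] = {
    "site_a": [5.0, 7.0, 9.0],
    "site_b": [4.0, 6.0],
    "site_c": [10.0, 12.0, 14.0, 16.0],
}


def local_summary(values: List[float]) -> Tuple[float, int]:
    """Runs at the data site and returns only aggregates (sum and count)."""
    return sum(values), len(values)


def combined_mean(summaries: List[Tuple[float, int]]) -> float:
    """Runs at the analyst's side and sees the summaries, never the rows."""
    total = sum(s for s, _ in summaries)
    count = sum(n for _, n in summaries)
    return total / count


# Each site is summarised independently, so this step could run in parallel.
summaries = [local_summary(values) for values in SITE_DATA.values()]
federated = combined_mean(summaries)

# For comparison only: the conventional analysis on physically pooled data.
pooled_values = [x for values in SITE_DATA.values() for x in values]
pooled = sum(pooled_values) / len(pooled_values)

print(federated, pooled)
```

The mean is a simple case because it decomposes exactly into per-site sums and counts; more complex analyses need more elaborate summaries, which this sketch does not attempt to show.
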
Aims & Objectives
This project will develop open source tools to test, monitor and enhance the security of the DataSHIELD infrastructure. This may include: identifying security weaknesses; developing a protocol for security stress testing; developing systems to identify and monitor threats arising from inferential disclosure (i.e. malevolent combinations of legal data requests); and developing security solutions. Opportunities to develop DataSHIELD statistical analysis/data visualisation packages are also available.
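
To make the idea of inferential disclosure concrete, the sketch below shows a classic differencing attack: two requests that are each permitted on their own combine to reveal one participant's value. It also shows one naive way a monitor might flag such a pair. This is a minimal Python illustration under stated assumptions, not how DataSHIELD implements its disclosure controls; the records, the `MIN_CELL_COUNT` threshold and the functions `safe_sum` and `flags_differencing` are hypothetical.

```python
from typing import Dict, FrozenSet, List

# Hypothetical rule: aggregates over fewer than this many people are refused.
MIN_CELL_COUNT = 3

# Hypothetical records held at one site: participant id -> measurement.
RECORDS: Dict[int, float] = {1: 120.0, 2: 135.0, 3: 128.0, 4: 160.0, 5: 142.0}


def safe_sum(ids: FrozenSet[int]) -> float:
    """Returns an aggregate only if the subset meets the size threshold."""
    if len(ids) < MIN_CELL_COUNT:
        raise PermissionError("subset too small to release an aggregate")
    return sum(RECORDS[i] for i in ids)


def flags_differencing(history: List[FrozenSet[int]], new_query: FrozenSet[int]) -> bool:
    """Naive monitor: flag a query whose subset differs from an earlier
    subset by fewer than MIN_CELL_COUNT participants."""
    return any(0 < len(past ^ new_query) < MIN_CELL_COUNT for past in history)


everyone = frozenset(RECORDS)
all_but_4 = everyone - {4}

# Both requests pass the size check individually...
leaked = safe_sum(everyone) - safe_sum(all_but_4)
# ...yet their difference is exactly participant 4's measurement.
print(leaked)

# A monitor that remembers earlier subsets would flag the second request.
print(flags_differencing([everyone], all_but_4))
```

A real monitoring system would have to handle overlapping subsets defined by arbitrary filters and many query types, so pairwise set comparison is only a starting point for the kind of tooling described above.
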
Methods
i) Adopting a transdisciplinary approach to the creation, optimisation and implementation of open source software, based on prior experience and best practice in other settings and disciplines.

ii) Developing and using a variety of skills across programming and scripting languages (e.g. Python, Perl, PHP) and scientific data analysis languages (particularly R), with application to Linux system administration, development operations and web operations within DataSHIELD.

iii) Working carefully and systematically through the complete software development life cycle: systems analysis/audit, design, development, testing, implementation and maintenance.

iv) Placing appropriate emphasis on comprehensive and comprehensible documentation of all work and developments.

