This project is no longer listed on FindAPhD.com and may not be available.

Dr H Shanahan Applications accepted all year round Competition Funded PhD Project (Students Worldwide)

Egham United Kingdom

About the Project

While genomic data give us important insights into the molecular biology of a cell, transcriptomic data (which tell us which sections of the genome are being actively transcribed) give us a first dynamic picture of the cell. It has been hoped that such data sets will yield large-scale gene networks and hence provide the framework for many of the goals of Systems Biology.

A large, publicly available body of transcriptomic data generated with micro-array technology already exists. As the cost of sequencing has collapsed (see figure 1), RNA-seq data, which can potentially provide much more sensitive results, are likely to become commonplace and to be applied in a variety of areas such as biomedical research, clinical applications and agrotechnology.

There are two significant challenges to the successful application of these data. In the first instance, present transcriptomic data sets have not been as successful as hoped in inferring gene interactions. One cause of this is bias in the data due to anomalous hybridisations (shown in figure 2). It is likely that similar biases will exist for RNA-seq data. Removing such biased data, and understanding other such biases, could substantially improve the quality of predictions. A second challenge is that RNA-seq data sets are inherently much larger than micro-array data sets, and global studies spanning many different experiments will require hundreds of terabytes (if not petabytes) of storage. Such data sets cannot be easily moved via the Internet. It is therefore likely that such data will be stored in data centres co-located with the facilities where they are generated. An exemplar of this is BGI, a private company based in Shenzhen, China, which is the major provider of next-generation sequencing facilities in the world and which also offers a cloud-computing service to analyse its data.
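To give a rough sense of why data sets of this size cannot be easily moved, the following sketch estimates how long a transfer would take over a sustained network link. The link speeds, the decimal definition of a terabyte (10^12 bytes) and the helper name `transfer_days` are illustrative assumptions and do not come from the project description.

```python
def transfer_days(size_terabytes: float, link_megabits_per_s: float,
                  efficiency: float = 1.0) -> float:
    """Estimate the days needed to move a data set over a network link.

    Assumes decimal units (1 TB = 1e12 bytes, 1 Mbit = 1e6 bits) and a link
    that sustains `efficiency` (0 < efficiency <= 1) of its nominal speed.
    """
    if size_terabytes < 0:
        raise ValueError("size_terabytes must be non-negative")
    if link_megabits_per_s <= 0:
        raise ValueError("link_megabits_per_s must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")

    total_bits = size_terabytes * 1e12 * 8            # bytes -> bits
    bits_per_second = link_megabits_per_s * 1e6 * efficiency
    seconds = total_bits / bits_per_second
    return seconds / 86_400                           # seconds per day


if __name__ == "__main__":
    # Hypothetical data set sizes (TB) and link speeds (Mbit/s).
    for size_tb in (100, 1000):
        for link in (100, 1000, 10_000):
            days = transfer_days(size_tb, link)
            print(f"{size_tb:>5} TB over {link:>6} Mbit/s: {days:8.1f} days")
```

Under these assumptions, 100 TB over a sustained 1 Gbit/s link takes roughly nine days, and a petabyte over the same link takes roughly three months, which is consistent with the case for analysing the data where they are stored.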

The project is composed of three parts. In the first instance, the student will examine a variety of cloud computing and distributed computing platforms, ranging from purely commercial solutions (EC2, Azure, Google Cloud) to open-source solutions (OpenNebula), and different service paradigms (PaaS to SaaS), to determine the optimal configuration for the analysis of such data.

In the second instance, the student will scale up a pilot analysis carried out on one type of micro-array to a wide variety of micro-arrays (GeneChips, SNPChips and tiling arrays), using the cloud platform previously determined to be optimal for these types of problems. The emphasis will be on providing summary measures that can be used reliably for quality control purposes.

Finally, the student will extend the analysis to RNA-seq data to see whether these data are also susceptible to the kind of sequence biases that occur in micro-arrays.

Where will I study?

Royal Holloway, University of London

Royal Holloway is one of the UK’s leading research-intensive universities, and home to some of the world’s foremost authorities in the sciences, arts, business, economics and law. We’re proud of our groundbreaking history, for opening up university education to those who didn’t think they belonged, challenging conventions and asking the difficult questions. Today, we continue to empower students and transform lives through inclusive education, an active and close-knit research community and local and global partnerships. Our world-class research tackles real-world problems, seeking creative solutions to complex challenges, alleviating inequalities, and living sustainably in today’s interconnected and digital world. We set new research agendas inspired by our commitment to equality, academic excellence, and social justice. The Doctoral School brings students from all departments together to stage interdisciplinary research symposia and social events, to spark fresh conversations, new theories and unique collaborations.

The analysis of large transcriptomic data sets using distributed computing

Royal Holloway, University of London Dept of Computer Science
