
  Identifying Copy Number Variants using Whole Exome Sequencing Data


   Department of Biostatistics

This project is no longer listed on FindAPhD.com and may not be available.

  Dr A Auer-Fowler, Prof Andrew Morris
  Applications accepted all year round
  Self-Funded PhD Students Only

About the Project

Copy Number Variants (CNVs) are a common form of genetic variation and are known to contribute to genetic diseases. Whole Exome Sequencing (WES) is a relatively cheap form of genetic sequencing which targets just the coding regions of the genome. WES is becoming increasingly popular, particularly in clinical applications where the causal genes are known.
Identifying CNVs from WES data is currently unreliable, so CNVs are often ignored in WES studies or detected using an alternative and potentially costly technology. Improving CNV calling from WES data therefore has the potential to reveal important CNVs and to reduce costs by removing the need for alternative technologies.

The breakpoints of CNVs generally lie outside the coding regions targeted in WES, so it is currently assumed that the signals associated with the breakpoints will not be observed. However, WES generates a large number of off-target reads (40-60% of all reads), some of which will contain additional information for the identification of CNVs. These off-target reads have proved informative in other applications but are largely ignored in the field of CNV detection.
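As a rough illustration of what separating on- and off-target reads involves, the sketch below splits read start positions against a set of capture intervals and counts the off-target reads in fixed-width bins. It is only a toy under stated assumptions: the function names, the interval coordinates, the read positions and the bin size are all hypothetical, and real pipelines work from aligned read files rather than plain lists of coordinates.

```python
from bisect import bisect_right
from collections import Counter


def split_on_off_target(targets, read_starts):
    """Split read start positions into on-target and off-target lists.

    targets: sorted, non-overlapping half-open intervals (start, end)
             on a single chromosome (hypothetical capture regions).
    read_starts: read start coordinates on the same chromosome.
    """
    starts = [start for start, _ in targets]
    on_target, off_target = [], []
    for pos in read_starts:
        i = bisect_right(starts, pos) - 1  # last target starting at or before pos
        if i >= 0 and pos < targets[i][1]:
            on_target.append(pos)
        else:
            off_target.append(pos)
    return on_target, off_target


def bin_counts(positions, bin_size):
    """Count positions per fixed-width bin, keyed by bin index."""
    return Counter(pos // bin_size for pos in positions)


# Toy data (hypothetical): two exon targets and ten read start positions.
targets = [(100, 200), (500, 600)]
reads = [50, 120, 150, 199, 200, 300, 450, 510, 590, 700]

on_target, off_target = split_on_off_target(targets, reads)
off_fraction = len(off_target) / len(reads)
off_bins = bin_counts(off_target, bin_size=250)
print(off_fraction, dict(off_bins))
```

Binned off-target counts of this kind give a sparse depth signal across the regions between targets, which is one way such reads could carry information about CNVs whose breakpoints fall outside the exome.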

The aim of this project is to improve CNV calling from WES data by incorporating multiple signals and using all reads generated. Additionally, these methodological improvements will be applied to large WES studies and therefore contribute to our understanding of the role of CNVs in complex human traits.
Scientific objectives

1. Develop a statistical model for CNVs in WES data which integrates multiple signals from on- and off-target reads. Bayesian approaches are effective in incorporating prior information, such as sequence content, and in hierarchically linking multiple samples, so adopting a Bayesian framework will increase the robustness of the model. Data from the 1000 Genomes Project will act as a ‘gold standard’ data set for benchmarking and optimization.
2. Implement efficient software for this model, allowing it to be applied to large numbers of samples.
3. Apply it to: (i) 2,500 WES from the Estonian Biobank, for which detailed disease phenotypes and lifestyle data are available; and (ii) 52,000 WES from the T2DGENES Consortium to study the contribution of CNVs to T2D risk and related metabolic traits.
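To make the idea of combining signals in a Bayesian framework concrete, here is a minimal sketch, not the project's model. It assumes that read counts for a region are Poisson distributed with a mean proportional to copy number, treats on-target and off-target counts as independent observations, and places a prior that favours the diploid state. The function names, the prior values, the expected diploid depths and the `eps` floor for copy number 0 are all illustrative assumptions, and the sketch handles one region in one sample with no hierarchical linking.

```python
import math


def log_poisson(k, lam):
    """Log of the Poisson probability of observing k events with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def copy_number_posterior(counts, diploid_means, prior, eps=0.01):
    """Posterior over copy-number states for one region.

    counts: observed read counts, one per signal (e.g. on- and off-target).
    diploid_means: expected count for each signal at copy number 2.
    prior: dict mapping copy number to prior probability.
    eps: depth scale used for copy number 0, allowing a few stray reads.
    """
    log_post = {}
    for cn, p in prior.items():
        scale = max(cn / 2.0, eps)
        log_lik = sum(log_poisson(k, m * scale)
                      for k, m in zip(counts, diploid_means))
        log_post[cn] = math.log(p) + log_lik
    # Normalise in log space to avoid underflow.
    top = max(log_post.values())
    weights = {cn: math.exp(v - top) for cn, v in log_post.items()}
    total = sum(weights.values())
    return {cn: w / total for cn, w in weights.items()}


# Hypothetical prior favouring the diploid state.
prior = {0: 0.001, 1: 0.01, 2: 0.978, 3: 0.01, 4: 0.001}

# Hypothetical region: about half the expected depth in both signals.
posterior = copy_number_posterior(counts=[52, 9],
                                  diploid_means=[100.0, 20.0],
                                  prior=prior)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 4))
```

In this toy case both signals sit at roughly half the diploid expectation, so the evidence for a single-copy deletion outweighs the prior. A model of the kind described in objective 1 would additionally need to account for sequence content, overdispersion and sharing of information across samples.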
Person specification

The successful candidate is likely to hold a 1st or 2:1 degree in a relevant discipline (statistics, mathematics, computing or bioinformatics), preferably with a Master's degree. Experience of programming is essential (e.g. R, C++, Python).
Training and support

The student will receive support from supervisors to enable them to understand their research, publish their work and attend scientific conferences. Further training in statistics and genetics will be provided through targeted courses run by the Department of Biostatistics and the Institute of Translational Medicine. Additionally, Liverpool University runs courses on broader subjects such as scientific writing and computer programming skills, if required. Being embedded in the statistical genetics group will allow the student to benefit from the expertise of the group as a whole. The student will receive broader exposure to statistical genomics as part of the North of England Genetic Epidemiology Group (NEGEG), which offers younger researchers the opportunity to present their research regularly and to network with other students and postdoctoral researchers based at universities in the North of England.

Applicants should send a CV, academic transcripts, a letter of motivation and the names of two referees who can send letters of recommendation to Nyree Collinson [Email Address Removed].


Funding Notes

The successful candidate will be provided with state-of-the-art resources for computing, and support for research, training courses and conferences, as well as tuition fees (at Home/EU rate) and a monthly stipend for 3 years.
