About the Project
We propose to use pan-cancer data from the Genomics England 100,000 Genomes Project to: 1) investigate the contribution of non-coding variants to the heritability of cancer; and 2) develop computational methods to analyse the impact of structural variation and copy number variation on cancer heritability.
The broad goal of this project is to understand the genetic architecture of cancer risk in patients from the United Kingdom. To achieve this goal, we will perform an integrative pan-cancer analysis of whole genomes from patients in the 100,000 Genomes Project to: 1) identify novel rare germline alleles and structural variations (SVs) associated with cancer risk; and 2) investigate the interaction of inherited polymorphisms with somatic events in tumors. Understanding cancer heritability has an enormous impact on disease prevention, early detection and tailored treatment [1]. While genome-wide association studies (GWAS) have identified hundreds of risk alleles for cancer, most of these heritable polymorphisms are common (allele frequency > 5%) and individually confer only a modest increase in risk [2,3]. It has been argued that the missing heritability of cancers could reside in understudied rare variants (allele frequency < 5%) [3,4]; in SVs, including copy number variations (CNVs) [2]; and in the interaction between germline and somatic mutations [5,6]. Previously, analysis of these genetic patterns has been limited by the lack of high-coverage whole-genome sequencing data and by methods unsuited to detecting such associations. With the increased availability of next-generation sequencing data and advances in statistical methods for analyzing "big" data, the proposed project offers a unique opportunity to address ongoing and new questions in ways that were previously inaccessible.
Aim 1: Identify novel rare germline alleles and SVs associated with different cancers in the 100,000 Genomes Project. Given that most of the heritable variation associated with cancer remains unexplained [2-4], we hypothesize that other variant types, namely rare variants and SVs, could additively contribute to disease outcome (H1: the "additive model hypothesis"). Moreover, we predict that the effect sizes of significantly associated variants will be larger than those of the common disease-associated loci described in the literature. Expected significance: Research into the effect of rare variants and SVs on cancer risk is still in its infancy. This Aim will lead to a broader understanding of the genetic architecture underlying diverse cancers by identifying novel significantly associated loci, which can then serve as biomarkers for risk assessment and prevention.
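As a rough illustration of the additive idea behind H1, rare alleles within a region can be collapsed into a single per-sample burden score that is then tested for association with disease status. The sketch below is only an assumption about how such a score might be computed: the function name `rare_variant_burden`, the input layout (one list of alternate-allele counts per sample) and the use of a simple burden score rather than a kernel-based test are all illustrative choices, not the project's stated method. The 5% threshold mirrors the rare-variant definition used above.

```python
def rare_variant_burden(genotypes, max_af=0.05):
    """Collapse rare variants into one additive burden score per sample.

    genotypes: list of per-sample lists holding alternate-allele counts
               (0, 1 or 2) for the same ordered set of variants
               (hypothetical input layout).
    max_af:    allele-frequency threshold below which a variant is "rare".
    """
    n_samples = len(genotypes)
    n_variants = len(genotypes[0])

    # Alternate-allele frequency of each variant across all samples
    # (two alleles per diploid sample).
    allele_freq = [
        sum(sample[j] for sample in genotypes) / (2 * n_samples)
        for j in range(n_variants)
    ]

    # Keep variants that are observed at least once but are below the threshold.
    rare = [j for j, af in enumerate(allele_freq) if 0 < af < max_af]

    # Additive burden: total count of rare alternate alleles per sample.
    return [sum(sample[j] for j in rare) for sample in genotypes]


# Toy data: 20 samples, 2 variants. Variant 0 is seen once (rare),
# variant 1 is carried by half of the samples (common).
toy = [[1, 1]] + [[0, 1]] * 9 + [[0, 0]] * 10
burden = rare_variant_burden(toy)
```

In a real analysis the burden score would typically enter a regression model alongside covariates such as ancestry, age and sex; that step is omitted here.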
Aim 2: Characterize the link between germline and somatic mutations for different cancers in the 100,000 Genomes Project. Most cancer genomics research focuses on somatic events, such as acquired mutations, but increasing evidence suggests that inherited germline variation also plays a key role in cancer risk [5,6]. However, little is known about the link between germline and somatic alterations, or about the role this interaction plays in disease development. To date, some evidence has shown that individuals who inherit specific germline variants are more likely to have somatic mutations in certain cancer genes (e.g. PTEN and TP53) [5,6]. Moreover, the genes containing these germline mutations were found to participate in the same biological pathways as the somatically mutated cancer genes [5,6]. We similarly hypothesize that significantly associated germline mutations (common and rare) are correlated with somatic mutations in oncogenes that are known to affect tumorigenesis. Moreover, we predict that the germline and somatic mutations will occur in related genes in the same biological pathway (H2: the "germline-somatic mutation connectivity hypothesis"). Expected significance: Aim 2 will provide insight into the specific biological contexts that influence which cancer genes most effectively promote tumor development. In addition, characterizing the nature of germline-somatic interactions across different cancers will generate many testable hypotheses regarding the molecular mechanisms underlying disease risk.
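One simple way to picture the kind of test H2 implies is a 2x2 comparison: among patients, are carriers of a given germline variant more likely to have a somatic mutation in a given cancer gene than non-carriers? The sketch below computes a one-sided Fisher's exact test p-value from such a table using only the Python standard library. The function name, the table layout and the choice of Fisher's exact test are assumptions made for illustration; the project description does not specify the statistical method.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in a 2x2 table.

    Hypothetical table layout:
                      somatic mutated   somatic wild-type
        carriers            a                  b
        non-carriers        c                  d

    Returns P(X >= a) under the hypergeometric null, where X is the number
    of germline carriers whose tumor has the somatic mutation.
    """
    carriers = a + b
    mutated = a + c
    total = a + b + c + d

    # Number of ways to choose which patients are somatically mutated.
    denom = comb(total, mutated)

    # Sum the probabilities of tables at least as extreme as the observed one.
    upper = min(carriers, mutated)
    numer = sum(
        comb(carriers, k) * comb(total - carriers, mutated - k)
        for k in range(a, upper + 1)
    )
    return numer / denom


# Toy example: all 3 carriers are mutated and all 3 non-carriers are not.
p_value = fisher_enrichment_p(3, 0, 0, 3)
```

Across many germline-somatic gene pairs, p-values of this kind would need correction for multiple testing, and a cohort-scale analysis would also have to account for cancer type and ancestry.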
The results of our analysis will advance our understanding of missing heritability in cancer development, and the project will develop novel methodologies to integrate CNVs and SVs in GWAS. You will work in a young and dynamic multidisciplinary team, with intensive bioinformatics training and support from leading experts in population genetics and cancer genomics.
Literature Cited
1 Baselga, J. et al. AACR Cancer Progress Report 2015. Clin Cancer Res 21, S1-128, doi:10.1158/1078-0432.CCR-15-1846 (2015).
2 McCarroll, S. A. Extending genome-wide association studies to copy-number variation. Hum Mol Genet 17, R135-142, doi:10.1093/hmg/ddn282 (2008).
3 Sud, A. et al. Genome-wide association studies of cancer: current insights and future perspectives. Nat Rev Cancer 17, 692-704, doi:10.1038/nrc.2017.82 (2017).
4 Bodmer, W. et al. Rare genetic variants and the risk of cancer. Curr Opin Genet Dev 20, 262-267, doi:10.1016/j.gde.2010.04.016 (2010).
5 Carter, H. et al. Interaction Landscape of Inherited Polymorphisms with Somatic Events in Cancer. Cancer Discov 7, 410-423, doi:10.1158/2159-8290.CD-16-1045 (2017).
6 Geeleher, P. et al. Exploring the link between the germline and somatic genome in cancer. Cancer Discov 7, 354-355 (2017).
7 Wu, M. C. et al. Rare-variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet 89, 82-93, doi:10.1016/j.ajhg.2011.05.029 (2011).
8 Korte, A. et al. The advantages and limitations of trait analysis with GWAS: a review. Plant Methods 9, 29, doi:10.1186/1746-4811-9-29 (2013).
9 Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81, 559-575, doi:10.1086/519795 (2007).
10 Browning, S. R. et al. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet 81, 1084-1097, doi:10.1086/521987 (2007).
11 Dennis, G., Jr. et al. DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol 4, P3 (2003).
12 Herwig, R. et al. Analyzing and interpreting genome data at the network level with ConsensusPathDB. Nat Protoc 11, 1889-1907, doi:10.1038/nprot.2016.117 (2016).