A multidisciplinary project combining genetic epidemiology and environmental exposure research to identify gene by air pollution interactions across a wide range of human chronic diseases.
Air pollution is the environmental factor with the greatest impact on health. The global annual health burden from particle air pollution alone is estimated to be 4.2 million deaths and 103 million disability-adjusted life-years, not counting other effects on general well-being, and associated economic cost. In the UK these figures are 29,000 deaths and 340,000 years of life lost. Air pollution impacts on a wide range of diseases (including cardiovascular, respiratory, cancer and neurological), but each individual reacts differently to air pollution and can present different adverse health outcomes.
Complex diseases are determined by the interplay of genetics, the environment and their interactions. Whilst the role of genetic variation has been widely studied in recent years through Genome-wide Association Studies (GWAS), far less attention has been paid to how the effect of genetic variation is modulated by environmental exposures such as pollution. In this study we will investigate whether genetic variation interacts with estimates of residential air and noise pollution to determine the onset of disease.
UK Biobank is a large prospective epidemiological study that aims to understand serious and life-threatening diseases. This understanding will ultimately help to improve their prevention, diagnosis and treatment. Using 500,000 UK Biobank participants with measurements of residential pollution (air and noise), as well as greenspace and coastal proximity, we will test for gene by environment interactions.
Further, understanding gene by pollution interactions could potentially inform air quality standards for the general population and give individuals information on the potential effects of pollution on their own health; that is, the results could help to tailor advice for a person on their risk of disease depending on the pollution level where they live.
Are there pollution by gene interactions for disease and wellbeing?
Which genes are most affected by pollution?
The project will develop and use Linear Mixed Models and Generalised Linear Mixed Models to test for gene by pollution interactions, and implement the methodological developments into user-friendly software. Due to the large computational demands of the proposed analyses it is vital that the project uses distributed memory systems and the tools and software required to capitalise on these systems. One such system is the UK National Supercomputer (http://www.archer.ac.uk/), which we have used extensively in our research. The software developed will be applied to detect gene by pollution interactions for a range of complex diseases and measures of wellbeing.
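To make the idea of an interaction test concrete, the following is a minimal sketch, not the project's method. It assumes a single genetic variant coded as 0/1/2 allele counts, a standardised pollution exposure, and a quantitative phenotype, and it fits an ordinary least squares model with a genotype by exposure product term. Ordinary least squares is used here only as a simplified stand-in for the Linear Mixed Models named above: it ignores relatedness, population structure and random effects. The function names `interaction_test` and `simulate`, the effect sizes, and the simulated data are all hypothetical and purely illustrative; no UK Biobank data are involved.

```python
import numpy as np


def interaction_test(genotype, exposure, phenotype):
    """Fit phenotype ~ 1 + G + E + G*E by OLS (a simplified stand-in for an LMM).

    Returns the estimated interaction coefficient and its t statistic.
    """
    g = np.asarray(genotype, dtype=float)
    e = np.asarray(exposure, dtype=float)
    y = np.asarray(phenotype, dtype=float)

    # Design matrix: intercept, genotype, exposure, genotype x exposure.
    X = np.column_stack([np.ones_like(g), g, e, g * e])
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)

    # Residual variance and the standard error of the interaction term.
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = (resid @ resid) / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    se = np.sqrt(cov[3, 3])
    return beta[3], beta[3] / se


def simulate(n, beta_gxe, seed=0):
    """Simulate hypothetical data with a known interaction effect (assumed values)."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # allele counts, frequency 0.3
    e = rng.normal(size=n)                          # standardised exposure
    y = 0.2 * g + 0.3 * e + beta_gxe * g * e + rng.normal(size=n)
    return g, e, y


g, e, y = simulate(n=5000, beta_gxe=0.15)
beta_hat, t_stat = interaction_test(g, e, y)
print(f"estimated GxE effect: {beta_hat:.3f}, t = {t_stat:.2f}")
```

In a genome-wide setting a test of this kind would be repeated for every variant and every disease, which is one reason the computational demands described above are so large.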
Year 1: Literature review of population and genetic associations between environmental variables and health, quality control of the data, preliminary analysis of the data.
Year 2: Large-scale analyses, including testing of scripts and software, and extracting and tabulating the results for each disease.
Year 3: Processing of the results generated in the large-scale analyses and follow-up of relevant and novel hypotheses arising from the work in year 2. For instance, how many diseases show evidence of gene by pollution interactions? What are the genes most affected by pollution?
A comprehensive training programme will be provided comprising both specialist scientific training and generic transferable and professional skills.
The project specific training will include:
High performance computing.
Instruction in atmospheric and related environmental pollution issues.
Canela-Xandri, O., Rawlik, K. and Tenesa, A. (2018) An atlas of genetic associations in UK Biobank, Nature Genetics 50, 1593-1599 https://www.ncbi.nlm.nih.gov/pubmed/30349118
Canela-Xandri, O., Rawlik, K., Woolliams, J. A. and Tenesa, A. (2016) Improved Genetic Profiling of Anthropometric Traits Using a Big Data Approach, PLOS ONE 11, e0166755 https://www.ncbi.nlm.nih.gov/pubmed/27977676
Rawlik, K., Canela-Xandri, O. and Tenesa, A. (2016) Evidence for sex-specific genetic architectures across a spectrum of human complex traits, Genome Biology 17, 166 https://www.ncbi.nlm.nih.gov/pubmed/27473438
Canela-Xandri, O., Law, A., Gray, A., Woolliams, J. A. and Tenesa, A. (2015) A new tool called DISSECT for analysing large genomic data sets using a Big Data approach, Nature Communications 6, 10162 https://www.ncbi.nlm.nih.gov/pubmed/26657010