NGCM-0047: Computational Modelling Across ‘Omic’ Scales of Biology
Recent rapid advances in instrumentation, coupled with our ability to archive and distribute vast amounts of data, have made biology an exciting area in which mathematical and computational modelling is widely acknowledged as crucial to progress, both in developing a clear understanding of its organisational principles and in translating that understanding into the treatment of complex diseases. The data gathered include nucleic acid sequences of DNA, amino acid sequences of proteins, relative expression levels of messenger RNA, relative abundances of the corresponding proteins, half-lives of these macromolecules, rates at which RNA is translated into protein, concentrations of metabolites arising in biochemical reactions, and epigenetic modifications such as DNA methylation. Most modelling work with such large datasets addresses only one of these levels at a time, solving computational problems such as predicting disease susceptibility, response to treatment, and genetic and biochemical function.
In this project we will develop novel data-driven computational methods that work across these ‘omic’ scales, allowing integrative analysis of data gathered at the various levels. The project will build on our recent work on integrative analysis of the transcriptome (mRNA concentrations) and proteome (protein levels), in which we established a robust sparse linear regression model for predicting protein concentrations. Using this model and a dataset of parallel mRNA and protein measurements in the model organism yeast, we demonstrated that outliers with respect to such a predictor can identify post-translationally regulated genes. We also developed two novel algorithms, based on a difference-of-convex formulation, to carry out such outlier-rejecting regressions.
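To make the idea of outliers with respect to a sparse linear predictor concrete, here is a minimal sketch on synthetic data. It is not the difference-of-convex algorithms from the publications. As a simple stand-in it fits an ordinary lasso by proximal gradient descent and then flags genes whose residuals are large relative to a robust scale estimate. The function names (lasso_ista, flag_outliers), the synthetic feature matrix, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam=0.05, n_iter=500):
    """Minimise (1/2n)||y - Xw||^2 + lam * ||w||_1 by proximal gradient (ISTA)."""
    n, d = X.shape
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w

def flag_outliers(X, y, w, k=3.0):
    """Flag samples whose residual is more than k robust standard deviations from the median."""
    r = y - X @ w
    med = np.median(r)
    scale = 1.4826 * np.median(np.abs(r - med))  # MAD rescaled to a Gaussian std
    return np.abs(r - med) > k * scale

# Synthetic stand-in for "features per gene -> protein level" (hypothetical data).
rng = np.random.default_rng(0)
n_genes, n_features = 200, 10
X = rng.standard_normal((n_genes, n_features))
w_true = np.zeros(n_features)
w_true[:3] = [2.0, -1.5, 1.0]              # only three features matter
y = X @ w_true + 0.1 * rng.standard_normal(n_genes)
y[:5] += 5.0                                # five genes deviate from the predictor

w = lasso_ista(X, y)
flags = flag_outliers(X, y, w)
print("flagged genes:", np.flatnonzero(flags))
```

In this toy setting the five shifted genes play the role of candidates for regulation that the linear predictor cannot explain. A plain lasso is still influenced by the outliers during fitting, which is the motivation for regressions that reject outliers as part of the optimisation itself.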
The project will extend this work, using novel matrix factorization methods as the underlying computational engine. We will also integrate RNA secondary structures obtained from ab initio structure prediction models, because we hypothesize that the secondary structure of RNA carries information about differential protein synthesis speeds.
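The description does not say which matrix factorization methods will be used. Purely as an illustration of the general idea, the sketch below applies a standard non-negative matrix factorization with multiplicative updates to two synthetic data blocks (labelled mrna and protein) that share the same genes, so that a single gene factor matrix is learned across both levels. The data, the choice of factorization, the rank, and all names are assumptions and not the project's method.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate V (non-negative) as W @ H using multiplicative updates
    for the squared Frobenius error."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(V, W, H):
    """Frobenius reconstruction error relative to the norm of V."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Hypothetical data: 50 genes measured under 6 conditions at two 'omic levels.
rng = np.random.default_rng(1)
gene_factors = rng.random((50, 3))            # shared latent gene structure
mrna = gene_factors @ rng.random((3, 6))      # transcript-level block
protein = gene_factors @ rng.random((3, 6))   # protein-level block

# Stack the blocks column-wise so both levels share one gene factor matrix W.
V = np.hstack([mrna, protein])

W0, H0 = nmf(V, rank=3, n_iter=0)             # random initialisation only
W, H = nmf(V, rank=3)
err_init = relative_error(V, W0, H0)
err_final = relative_error(V, W, H)
print("relative error before and after fitting:", err_init, err_final)
```

Stacking the blocks side by side is one simple way of coupling the levels. Additional per-gene information, such as features derived from predicted RNA secondary structure, could in principle be appended as further columns, although that is an extrapolation beyond what the description states.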
Gunawardana, Y. and Niranjan, M. (2013) Bridging the gap between transcriptome and proteome identifies post-translationally regulated genes, Bioinformatics 29(23): 3060-3066.
Gunawardana, Y., Fujiwara, S., Takeda, A., Woo, J., Woelk, C. and Niranjan, M. (2015) Outlier detection at the transcriptome proteome interface, Bioinformatics 31(15): 2530-2536.
This project is run through participation in the EPSRC Centre for Doctoral Training in Next Generation Computational Modelling (http://ngcm.soton.ac.uk). For details of our 4 Year PhD programme, please see http://www.findaphd.com/search/PhDDetails.aspx?CAID=331&LID=2652
For details of available projects, see http://www.ngcm.soton.ac.uk/projects/index.html
Visit our Postgraduate Research Opportunities Afternoon to find out more about Postgraduate Research study within the Faculty of Engineering and the Environment: http://www.southampton.ac.uk/engineering/news/events/2016/02/03-discover-your-future.page