This project focuses on scalable machine learning of time-series models. Such learning necessarily involves assessing many candidate models by processing the data with each of them in turn. The project is co-created and co-funded by DSTL.
This assessment is challenging because optimal processing of a time series (e.g., using a Kalman smoother in the context of linear Gaussian models) involves iterative consideration of the data in time order, making it hard to capitalise on parallel computational resources. Conversely, techniques that are readily parallelised (e.g., belief propagation) can fail to provide an accurate assessment of a model’s efficacy, particularly when the phenomenon that the model is trying to describe generates artefacts in the data that are only visible over long timescales. This project will investigate hybrid approaches that combine the ability to capitalise on parallel resources with near-optimal processing.
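To make the bottleneck concrete, the following is a minimal sketch of how candidate models might be scored, assuming a scalar linear Gaussian model and using the forward pass of a Kalman filter (not the full smoother) to compute each candidate's log-likelihood. The function names, the synthetic data and the parameter values are all hypothetical illustrations and are not taken from the project or from DSTL's use cases.

```python
import math
import random


def kalman_log_likelihood(ys, a, q, r, m0=0.0, p0=1.0):
    """Log-likelihood of observations ys under the assumed scalar model
        x_t = a * x_{t-1} + w_t,  w_t ~ N(0, q)
        y_t = x_t + v_t,          v_t ~ N(0, r)
    with prior x_0 ~ N(m0, p0)."""
    m, p = m0, p0
    log_lik = 0.0
    # This loop is inherently sequential: each step needs the posterior
    # mean and variance produced by the previous step.
    for y in ys:
        # Predict the state one step ahead.
        m_pred = a * m
        p_pred = a * a * p + q
        # Innovation (prediction error) and its variance.
        innov = y - m_pred
        s = p_pred + r
        log_lik += -0.5 * (math.log(2.0 * math.pi * s) + innov * innov / s)
        # Update the state estimate with the new observation.
        k = p_pred / s
        m = m_pred + k * innov
        p = (1.0 - k) * p_pred
    return log_lik


def simulate(n, a, q, r, seed=0):
    """Draw n synthetic observations from the same scalar model."""
    rng = random.Random(seed)
    x = 0.0
    ys = []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, math.sqrt(q))
        ys.append(x + rng.gauss(0.0, math.sqrt(r)))
    return ys


# Hypothetical experiment: score three candidate values of a.
ys = simulate(500, a=0.9, q=0.1, r=0.1)
candidates = [0.1, 0.5, 0.9]
# Each candidate is scored independently, so this outer loop could be
# distributed across workers; the inner time loop could not.
scores = {a: kalman_log_likelihood(ys, a, q=0.1, r=0.1) for a in candidates}
best = max(scores, key=scores.get)
print(scores, best)
```

In this sketch the only easy parallelism is across candidates, while the cost of each individual assessment still grows with the length of the series. That is the gap the hybrid approaches described above would aim to close.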
DSTL have a number of specific use cases (e.g., related to both cyber and physical surveillance) that will help focus the research comprising the PhD. The aim is to use the diversity of individual use cases to exemplify and demonstrate the generic utility of the research. One exemplar use case involves using historic GP and hospital admissions data to learn the parameters of a partially-observed non-linear epidemiological model for flu. Another exemplar use case involves learning the patterns-of-life associated with benign access to MoD’s intranet systems with a view to detecting anomalous activity that might be indicative of a cyber-attack. In the context of the use cases, DSTL will provide, for example, data, benchmark algorithms and metrics for comparison.
This project is part of the EPSRC-funded CDT in Distributed Algorithms: The What, How and Where of Next-Generation Data Science. https://www.liverpool.ac.uk/research/research-themes/digital/cdt-distributed-algorithms/
The University of Liverpool is working in partnership with the STFC Hartree Centre and other industrial partners from the manufacturing, defence and security sectors to provide an innovative four-year PhD training course that will equip over 60 students with the essential skills needed to become future leaders in data science, whether in academia or industry.
Every project within the centre is offered in collaboration with an industrial partner who, as well as providing co-supervision, will offer students the opportunity to access state-of-the-art computing platforms and to work on real-world problems, benchmarks and data. Our graduates will gain unparalleled experience working across academic disciplines in highly sought-after topic areas that answer industry need.
As well as offering the chance to learn from academic and industrial world leaders, the centre has a dedicated programme of interdisciplinary research training, including the opportunity to undertake modules at the global pinnacle of data science teaching. A large number of events and training sessions are undertaken as a cohort of PhD students, allowing you to build personal and professional relationships that we hope will lead to research collaboration, either now or in the future.
The learning nurtured at this centre will be based upon anticipation of the hardware resources that will arrive on students' desks after they graduate, rather than the hardware available today.