  Scalable Online Machine Learning (EPSRC CDT in Distributed Algorithms)


   EPSRC CDT in Distributed Algorithms

This project is no longer listed on FindAPhD.com and may not be available.

  Prof S Maskell  No more applications being accepted  Funded PhD Project (Students Worldwide)

About the Project

This PhD project is part of the CDT in Distributed Algorithms: The What, How and Where of Next-Generation Data Science.

The University of Liverpool’s Centre for Doctoral Training in Distributed Algorithms (CDT) is working in partnership with the STFC Hartree Centre and 20+ external partners from the manufacturing, defence and security sectors to provide an innovative 4-year PhD training programme that will equip over 60 students with: the essential skills needed to become future leaders in distributed algorithms; the technical and professional networks needed to launch a career in next-generation data science and future computing; and the confidence to make a positive difference in society, the economy and beyond.

The successful PhD student will be co-supervised by, and work alongside, our external partner GCHQ. The project relates to extending the state of the art to enable machine learning to fully capitalise on the information present in never-ending data streams. The additional data that arrives over time contains information that should facilitate improved machine learning. Not using this information gives rise to consistent yet surprising errors: this typically occurs when the training data is small relative to the algorithm’s empirical experience. Concept drift can also occur: the passage of time provides scope for the phenomena that give rise to the data to change. The result of concept drift is that, even if the phenomena of interest do not change, the performance of the machine learning is prone to degrading because the statistical environment changes. Furthermore, since the quantity of historic data is ever growing while data storage and computational resources are finite, innovative techniques are needed to summarise the information that is present in the data and currently pertinent, without requiring all the raw data ever received to be stored.

The proposed solution involves three novel components. First, to reduce the storage and computation that would otherwise be required, the pertinent data received up to the current time will be summarised in an adaptive tree-based data structure. The definition of this data structure will build on previous work on Approximate Bayesian Computation and involve approximating the information present in the raw data with summaries. Second, to ensure concept drift is catered for, these summaries will explicitly relate to the time-derivatives of the parameters that the machine learning is attempting to estimate. Finally, to maximise performance, previous related work involving variational inference, which will be extended to consider the aforementioned data structures, will also be adapted to consider numerical Bayesian inference.
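
As a rough illustration of the second idea only (summaries that track a parameter together with its time-derivative, using fixed memory), the sketch below uses a standard constant-velocity Kalman filter on a scalar stream. This is a swapped-in textbook technique, not the project's method: it does not use a tree-based structure, Approximate Bayesian Computation or variational inference. The class name `StreamingTrendFilter` and the noise settings `q` and `r` are hypothetical choices made for the example.

```python
class StreamingTrendFilter:
    """Constant-velocity Kalman filter over a scalar data stream.

    The whole stream is summarised by a mean (level, slope) and a 2x2
    covariance, so memory use does not grow with the number of samples.
    Illustrative assumption: linear-Gaussian model with hand-picked noise.
    """

    def __init__(self, q=1e-4, r=1.0, prior_var=1e3):
        self.level = 0.0          # current estimate of the parameter
        self.slope = 0.0          # current estimate of its time-derivative
        self.q = q                # process noise intensity (allows drift)
        self.r = r                # observation noise variance
        # Covariance entries: [[p00, p01], [p10, p11]]
        self.p00, self.p01 = prior_var, 0.0
        self.p10, self.p11 = 0.0, prior_var

    def update(self, y, dt=1.0):
        # Predict: propagate the level along the estimated slope.
        self.level += dt * self.slope
        p00 = self.p00 + dt * (self.p10 + self.p01) + dt * dt * self.p11
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11
        # Add white-noise-acceleration process noise.
        p00 += self.q * dt ** 3 / 3.0
        p01 += self.q * dt ** 2 / 2.0
        p10 += self.q * dt ** 2 / 2.0
        p11 += self.q * dt

        # Update: only the level is observed (H = [1, 0]).
        s = p00 + self.r                      # innovation variance
        k0, k1 = p00 / s, p10 / s             # Kalman gain
        innovation = y - self.level
        self.level += k0 * innovation
        self.slope += k1 * innovation
        self.p00, self.p01 = (1.0 - k0) * p00, (1.0 - k0) * p01
        self.p10, self.p11 = p10 - k1 * p00, p11 - k1 * p01
        return self.level, self.slope


if __name__ == "__main__":
    # Hypothetical stream whose underlying parameter drifts linearly.
    f = StreamingTrendFilter()
    for t in range(1, 201):
        level, slope = f.update(2.0 + 0.5 * t)
    print(level, slope)
```

A nonzero estimated slope is one simple signal that the underlying parameter is drifting. Raising `q` lets the filter forget older data faster, at the cost of noisier estimates.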

The approach will be applied to real-world datasets involving combinations of: near-constant parameters for which concept drift is not relevant (e.g. related to rare events of interest); parameters that fluctuate smoothly over long timescales (e.g. diffusive spread of memes); and sudden shifts in concepts (e.g. new memes appearing). Such datasets are anticipated to involve large and continually growing text corpora (e.g. social media).

Students are based at the University of Liverpool and are part of the CDT and Signal Processing research community. Every PhD is part of a larger research group, an incredibly social and creative team working together to solve tough research problems. Students have 2 academic supervisors and an industrial partner who provides co-supervision, placements and the opportunity to work on real-world challenges. In addition, students attend technical and professional training to gain unparalleled expertise to make a difference now and in the future.

This studentship is due to commence 1 October 2021 (Covid-19 Working Practices available).

Contact Dr Simon Maskell: [Email Address Removed] in the first instance or visit the CDT website for Director, Student Ambassador and Centre Manager details.

Visit the CDT website for application instructions, FAQs, interview timelines and tips.


Funding Notes

Visit the CDT website for funding and eligibility information: https://www.liverpool.ac.uk/distributed-algorithms-cdt/apply/
