Sparsity and structures in large-scale machine learning problems.

Project Description

The term “big data” generally refers to datasets that are too large or complex for traditional data-processing techniques to handle adequately. Applying modern machine learning techniques to such datasets therefore raises many theoretical and numerical challenges. The numerical complexity inherent to the processing of a dataset generally grows polynomially with its size, which de facto compromises the analysis of very large datasets. In addition, the treatment of complex datasets often results in models involving a large number of parameters, making such models difficult to train while increasing the risk of overfitting and limiting their interpretability. Since such large-scale datasets are increasingly common, their efficient processing is of great importance, not only at a purely scientific level, but also for many industrial and real-life applications.

In parallel with the use of high-performance computing solutions (e.g., parallelisation, computation using graphics processing units), many alternatives exist to try to overcome the difficulties inherent to the learning-with-big-data framework. For instance, problems related to the size of the datasets might be addressed through sample-size and dimension reduction techniques, while feature extraction, low-dimensional approximation or sparsity-inducing penalisation techniques might be used to prevent the model complexity from exploding. Such operations nevertheless need to be applied with great care, since they might have a significant impact on the quality of the final model and their effects are often intrinsically connected. To make matters worse, the existing theory surrounding such approximation techniques is generally quite modest.
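As a small illustration of the sparsity-inducing penalisation mentioned above, the sketch below applies soft-thresholding, the proximal operator associated with an L1 (lasso-type) penalty, to a vector of coefficients. The coefficient values, the threshold and the function name are hypothetical and chosen only for illustration; the project description does not prescribe any particular penalty.

```python
def soft_threshold(value, threshold):
    """Shrink a coefficient towards zero; set it to zero if it is small enough."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

# Hypothetical model coefficients (illustrative values only).
coefficients = [2.5, -0.3, 0.05, -1.2, 0.4]
threshold = 0.5  # assumed penalty strength

sparse_coefficients = [soft_threshold(c, threshold) for c in coefficients]
n_zero = sum(1 for c in sparse_coefficients if c == 0.0)

print(sparse_coefficients)
print(f"{n_zero} of {len(coefficients)} coefficients set to zero")
```

Coefficients whose magnitude does not exceed the threshold are set exactly to zero, which is how such penalties reduce the number of active parameters; the surviving coefficients are shrunk, which is one way the penalty can affect the quality of the final model.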

The main objective of this project is to investigate the design of efficient approaches to scale up and improve state-of-the-art machine learning techniques, while providing theoretical guarantees on their behaviour. Special emphasis will be placed on sample-size reduction and feature extraction procedures based on the notion of kernel discrepancy (also referred to as maximum mean discrepancy). Thanks to its ability to characterise representative samples, this notion has recently emerged as a powerful concept in machine learning, statistics and approximation theory (cf. reference 1 in section 4.2); combined with auto-encoder techniques, it is for instance at the core of recent developments in Generative Adversarial Networks (the MMD-GAN method). Investigating to what extent this type of approach can be generalised is one of the main motivations behind this project.
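To make the notion of kernel discrepancy slightly more concrete, the sketch below estimates the squared maximum mean discrepancy (MMD) between two one-dimensional samples using a Gaussian kernel. The biased (V-statistic) estimator, the bandwidth, the synthetic data and all function names are assumptions made for illustration; they are not taken from the project description.

```python
import math
import random

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two real numbers."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth)
            + mean_kernel(ys, ys, bandwidth))

rng = random.Random(0)                          # fixed seed for reproducibility
full = [rng.gauss(0.0, 1.0) for _ in range(300)]
subsample = full[::10]                          # every tenth point of the full sample
shifted = [x + 3.0 for x in subsample]          # same size, deliberately unrepresentative

print("full vs subsample:", mmd_squared(full, subsample))
print("full vs shifted:  ", mmd_squared(full, shifted))
```

Under these assumptions, the subsample drawn from the full data should yield a much smaller discrepancy than the shifted one, which is the sense in which the MMD can be used to judge whether a reduced sample is representative.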

Funding Notes

UK Research Council eligibility conditions apply.
Full awards (UK/EU fees plus maintenance stipend) are open to UK Nationals and EU students who can satisfy UK residency requirements. To be eligible for the award, EU Nationals must have been in the UK for at least three years prior to the start of the course for which they are seeking funding, including for the purposes of full-time education.


Applicants should submit an application for postgraduate study via the online application service for October 2019.

In the research proposal section of your application, please specify the project title and supervisors of this project and copy the project description in the text box provided. In the funding section, please select "I will be applying for a scholarship / grant" and specify that you are applying for advertised funding from EPSRC DTP.

If you are applying for more than one Cardiff University project, please note this in the research proposal section.

How good is research at Cardiff University in Mathematical Sciences?

FTE Category A staff submitted: 24.05

Research output data provided by the Research Excellence Framework (REF)

FindAPhD. Copyright 2005-2019
All rights reserved.