
Prof N Paton | Applications accepted all year round | Competition Funded PhD Project (Students Worldwide)

About the Project

Data wrangling is the process by which the data required by an application is identified, extracted, cleaned and integrated, to yield a data set that is suitable for exploration and analysis. Although there are widely used Extract, Transform and Load (ETL) techniques and platforms, they often require significant manual work from technical and domain experts at different stages of the process. When confronted with the four V's of big data (volume, velocity, variety and veracity), this manual intervention may make ETL prohibitively expensive.

As a result, we are interested in enabling cost-effective approaches to data wrangling, typically through automation or suggestion. In automation, individual or multiple steps within the data wrangling process are carried out by software, using evidence about what the user requires [1]. In suggestion, given a current situation, the user is informed of possible next steps from which to choose. In both cases, it is necessary to explain the proposed actions to the user, and allow additional information from the user to steer the steps that are followed [2].  
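The distinction between suggestion and automation can be illustrated with a minimal Python sketch. It is not taken from the cited work: the function names (`suggest_steps`, `automate`, `strip_whitespace`, `drop_duplicates`), the two cleaning steps and the sample rows are all hypothetical, chosen only to show the shape of the idea. In suggestion mode the software proposes applicable steps with a short explanation for each; in automation mode it applies them itself.

```python
def strip_whitespace(rows):
    """Trim leading and trailing spaces from every string value."""
    return [
        {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        for row in rows
    ]


def drop_duplicates(rows):
    """Remove exact duplicate rows, keeping the first occurrence of each."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


def suggest_steps(rows):
    """Suggestion mode: list (name, explanation, function) for applicable steps."""
    suggestions = []
    padded = sum(
        1
        for row in rows
        for v in row.values()
        if isinstance(v, str) and v != v.strip()
    )
    if padded:
        why = f"{padded} value(s) have leading or trailing spaces"
        suggestions.append(("strip_whitespace", why, strip_whitespace))
    duplicates = len(rows) - len(drop_duplicates(rows))
    if duplicates:
        why = f"{duplicates} row(s) repeat an earlier row"
        suggestions.append(("drop_duplicates", why, drop_duplicates))
    return suggestions


def automate(rows, max_steps=10):
    """Automation mode: apply the first suggestion until none remain."""
    applied = []
    for _ in range(max_steps):
        suggestions = suggest_steps(rows)
        if not suggestions:
            break
        name, _why, step = suggestions[0]
        rows = step(rows)
        applied.append(name)
    return rows, applied


# Hypothetical sample data: the first two rows differ only by a trailing space.
rows = [
    {"name": "Ada ", "city": "Manchester"},
    {"name": "Ada", "city": "Manchester"},
    {"name": "Bob", "city": "Leeds"},
]

for name, why, _step in suggest_steps(rows):
    print(f"suggested: {name} ({why})")

cleaned, applied = automate(rows)
print(applied)
print(cleaned)
```

In this sketch the explanation string attached to each suggestion plays the role of explaining a proposed action to the user, and a user could steer the process by choosing which suggestion to apply instead of always taking the first.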

While we have recently worked on the development of end-to-end automation for data preparation [1], we are also interested in developing techniques that integrate with the notebook environments that are widely used by data scientists.


Funding Notes

Candidates who have been offered a place for PhD study in the Department of Computer Science may be considered for funding by the Department. Further details on funding can be found at: https://www.cs.manchester.ac.uk/study/postgraduate-research/funding/.

References

[1] Nikolaos Konstantinou, Edward Abel, Luigi Bellomarini, Alex Bogatu, Cristina Civili, Endri Irfanie, Martin Koehler, Lacramioara Mazilu, Emanuel Sallinger, Alvaro A. A. Fernandes, Georg Gottlob, John A. Keane, Norman W. Paton: VADA: an architecture for end user informed data preparation. J. Big Data 6: 74 (2019).
[2] Nikolaos Konstantinou, Norman W. Paton: Feedback driven improvement of data preparation pipelines. Inf. Syst. 92: 101480 (2020).
