Providing Cost-Effective, Highly-Automated Approaches to Data Wrangling
Data wrangling is the process by which the data required by an application is identified, extracted, cleaned and integrated, to yield a data set that is suitable for exploration and analysis. Although there are widely used Extract, Transform and Load (ETL) techniques and platforms, they often require manual work from technical and domain experts at different stages of the process. When confronted with the 4 V’s of big data (volume, velocity, variety and veracity), manual intervention may make ETL prohibitively expensive. As a result, we are interested in providing cost-effective, highly-automated approaches to data wrangling; this involves significant research challenges requiring fundamental changes to established areas, including data integration and cleaning, and to the ways in which these areas are brought together. To enable well-informed decisions to be made by automated techniques, we propose to investigate comprehensive support for context awareness within data wrangling, building on adaptive, pay-as-you-go solutions that automatically tune the wrangling process to the requirements and resources of the specific application.
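The wrangling stages named above (extract, clean, integrate) can be sketched in miniature. This is a purely illustrative example, not code from the project; all function names, field names, and the de-duplication rule are hypothetical assumptions chosen for the sketch.

```python
# Hypothetical sketch of a three-stage wrangling pipeline:
# extract the needed fields, clean them, then integrate multiple sources.

def extract(records):
    """Identify and pull out only the fields the application needs."""
    return [{"name": r.get("name"), "age": r.get("age")} for r in records]

def clean(records):
    """Drop records with missing values; normalise casing and types."""
    out = []
    for r in records:
        if r["name"] is None or r["age"] is None:
            continue  # veracity: discard incomplete records
        out.append({"name": r["name"].strip().title(), "age": int(r["age"])})
    return out

def integrate(*sources):
    """Merge cleaned sources, de-duplicating on the 'name' key."""
    merged = {}
    for src in sources:
        for r in src:
            merged.setdefault(r["name"], r)  # first occurrence wins
    return list(merged.values())

# Two small sources with variety (formatting) and veracity (missing value) issues.
source_a = [{"name": " alice ", "age": "34"}, {"name": None, "age": "29"}]
source_b = [{"name": "Alice", "age": "34"}, {"name": "bob", "age": "41"}]

result = integrate(clean(extract(source_a)), clean(extract(source_b)))
```

A context-aware, pay-as-you-go approach would replace the hard-coded decisions here (which fields to extract, when to discard a record, how to match duplicates) with choices tuned automatically to the application's requirements and available resources.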
This research project is one of a number of projects at this institution, and it is in competition with them for funding; usually the project that attracts the strongest applicant will be funded. Funding is available to citizens of a number of European countries (including the UK), which in most cases includes all EU nationals. However, full funding may not be available to all applicants, and you should read the full department and project details for further information.
Research output at The University of Manchester in Computer Science and Informatics (data provided by the Research Excellence Framework, REF): FTE Category A staff submitted: 44.86.