  Data Lake Exploration with Modern Artificial Intelligence Techniques

   Department of Computer Science

  Applications accepted all year round  Competition Funded PhD Project (Students Worldwide)

About the Project

Data Lakes are emerging as data management infrastructures for storing data in various schemata and structural forms. Their goal is to serve as a single entry point for the data analysis process across highly heterogeneous datasets, supporting analytical tasks following a schema-on-read approach, in which data is discovered and integrated when it is to be used. Due to their semantic and structural heterogeneity, Data Lakes bring integration challenges to a new scale of complexity. With the fast development of Artificial Intelligence in recent years, modern techniques such as Large Language Models and Knowledge Graphs have proved effective on many problems, including those in data management and data science. These techniques provide a new and promising direction for addressing the challenges in Data Lakes.

The Information Management Group at the University of Manchester invites applications for PhD candidates in the area of Artificial Intelligence for Exploration of Data Lakes. PhD projects in this area will explore how contemporary techniques building on Language Models (such as Prompt Learning and Instruction Following Fine-tuning), Knowledge Engineering (such as Knowledge Graphs) and Data Engineering can be brought together to explore the deep semantics of tabular data for more efficient and effective Data Lake management.

Examples of research challenges include: 1) how to embed tables in a vector space with their schemas, instances and associated metadata; 2) how to combine semantics from Language Models and Knowledge Graphs for semantic table annotation and schema inference; and 3) how to characterize complex relationships between tables and table attributes and to use these to inform data integration.

Applicants are expected to have:

1. An excellent undergraduate degree in Computer Science or Mathematics (or related discipline), and preferably, a relevant M.Sc. degree.

2. Confidence and independence in programming complex systems in Java or Python.

3. Previous academic or industry experience in at least one of the relevant topics such as Machine Learning, Natural Language Processing, Semantic Web, Knowledge Engineering, Data Engineering and Data Science.

4. Excellent report writing and presentation skills.

Please note that applicants must additionally satisfy the standard requirements for postgraduate studies at the University of Manchester, such as a first-class or high upper-second class degree (or an equivalent international qualification) and English language qualifications, as stated in the Postgraduate Research Degree guidelines.

Entry requirements:

The minimum academic entry requirement for a PhD in the Faculty of Science and Engineering is an upper second-class honours degree (or international equivalent) in a discipline directly relevant to the PhD, OR any upper second-class honours degree (or international equivalent) and a Master’s degree at merit level (or international equivalent) in a discipline directly relevant to the PhD.

How to apply:

You will need to submit an online application through our website here:

When you apply, you will be asked to upload the following supporting documents: 

• Final Transcript and certificates of all awarded university level qualifications

• Interim Transcript of any university level qualifications in progress

• CV

• Contact details for two referees, supplied on the application form (please make sure that the contact email you provide is an official university/work email address, as we may need to verify the reference)

• English Language certificate (if applicable)

Your application form must be accompanied by a number of supporting documents by the advertised deadlines. Without all the required documents submitted at the time of application, your application will not be processed and we cannot accept responsibility for late or missed deadlines. Incomplete applications will not be considered. If you have any queries regarding making an application please contact our admissions team

We strongly recommend that you contact the supervisor to discuss the application before you apply. The email address for Prof Norman Paton is

Equality, diversity and inclusion is fundamental to the success of The University of Manchester, and is at the heart of all of our activities. We know that diversity strengthens our research community, leading to enhanced research creativity, productivity and quality, and societal and economic impact. We actively encourage applicants from diverse career paths and backgrounds and from all sections of the community, regardless of age, disability, ethnicity, gender, gender expression, sexual orientation and transgender status.

We also support applications from those returning from a career break or other roles.

Funding Notes

The University of Manchester offers a range of funding opportunities; please contact Prof Norman Paton to discuss the options.
