Project description:
Present-day deep learning models often operate as black boxes: they excel on performance metrics but offer little transparency into their decision-making processes. The field of interpretability aims to illuminate the inner workings of these black boxes. Interpretability is crucial for two main reasons: firstly, it offers insights into the limitations of existing models, guiding future research directions; and secondly, it aids in building models that are more resilient to adversarial attacks, which succeed by exploiting a model's vulnerabilities.
The aims of this project are twofold:
(1) To develop techniques for enhanced interpretation of model decisions, specifically in Transformer-based models. This can take the form of token attribution analysis, i.e., assessing which parts of the input were responsible for the model's final decision, or other probing experiments that shed light on the inner workings of these models (a minimal example is sketched after this list).
(2) To use interpretation/explanation techniques to delve deeper into issues commonly associated with LLMs, such as their poor grasp of numerical concepts, semantic ambiguity, and common sense. This can either be an analytical study explaining the reasons behind these shortcomings or a technical piece of work on improving the models with respect to them.
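To give a flavour of aim (1), below is a minimal sketch of input-token attribution using the Inseq library (one of the frameworks named under Deliverables), following its quickstart-style load_model/attribute interface. The model name, attribution method, and prompt are illustrative assumptions, not project requirements.

import inseq

# Wrap a Hugging Face model together with a gradient-based attribution method.
model = inseq.load_model("gpt2", "integrated_gradients")

# Attribute the generated continuation to the input tokens and visualise
# which parts of the prompt most influenced the model's output.
out = model.attribute("The capital of France is")
out.show()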
Deliverables:
The outputs of this project will mostly be published at NLP and AI conferences and in journals. Successful techniques also stand a chance of being integrated into existing interpretability frameworks such as Inseq and Captum.
Contact for more information on the project: Dr Taher Pilehvar; [Email Address Removed]
Academic criteria: A 2:1 Honours undergraduate degree or a master's degree in computing or a related subject. Applicants with appropriate professional experience will also be considered. Degree-level mathematics (or equivalent) is required for research in some project areas.
Applicants for whom English is not their first language must demonstrate proficiency by obtaining an IELTS score of at least 6.5 overall, with a minimum of 6.0 in each skills component.
How to apply:
Please contact the supervisors of the project prior to submitting your application to discuss and develop an individual research proposal that builds on the information provided in this advert. Once you have developed the proposal with the supervisors' support, please submit your application following the instructions provided below.
Please submit your application via Computer Science and Informatics - Study - Cardiff University
In order to be considered, candidates must submit the following information:
If you have any questions on the application process, please contact [Email Address Removed]