Ontological modelling for data analysis


   School of Computing and Information Science

Applications accepted all year round. Self-funded PhD students only.

About the Project

Research Group

Computing, Informatics and Applications Research Group.

Proposed supervisory team

Dr Cristina Luca

Dr Arooj Fatima

Theme

Semantic web, big data and the analysis of free text

Summary of the research project

'Big Data' is the term used to describe data whose volume, velocity and variety exceed the ability of traditional approaches to store and analyse it. Sources typically include postings on the internet, research documents and surveys. This research seeks to use improvements in processing capacity to enable the effective and timely analysis of very large sets of complex data. In government and large organisations, statistical methods are used to construct models that show how decisions may affect outcomes. However, these models take a long time to construct and may be technically limited in the amount and variety of data they can consider.

The increasing use of feedback mechanisms and other Web 2.0 user-generated content has created a large, unstructured but potentially valuable source of information representing the opinions of users, consumers, patients, students, travellers, holidaymakers, diners and others. A site such as TripAdvisor operates an explicit star rating system, but many other sources of data that could be useful to a manufacturer, retailer or service provider carry no explicit measure of satisfaction. One promising approach combines online text analysis resources with an ontological classification, and has been used to analyse the sentiment expressed in Twitter posts.

Sentiment analysis of text can highlight the concepts that are associated with positive or negative sentiment, and this information can be used to develop an ‘ontological’ model that helps to identify issues and model behaviour. An ontology provides a way of relating words that have similar meanings despite their different textual forms. For example, 'tutor', 'teacher' and 'lecturer' are textually distinct but have similar meanings. This allows us to build a model that summarises the key features of a domain, such as satisfaction with higher education, by analysing the free, unstructured text that might be found posted on social media.
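As an informal illustration of the idea, the sketch below maps textually distinct terms to a shared concept and then adds up a crude sentiment score per concept. It is only a toy under stated assumptions: the `CONCEPT_OF` mapping, the `POLARITY` word list, the `concept_sentiment` function and the sample posts are all hypothetical and are not part of the project's method or data. A real study would use a proper ontology and a trained sentiment model.

```python
import re
from collections import defaultdict

# Hypothetical mini-ontology: distinct surface terms mapped to one concept.
CONCEPT_OF = {
    "tutor": "Teacher",
    "teacher": "Teacher",
    "lecturer": "Teacher",
    "library": "Facilities",
    "labs": "Facilities",
}

# Toy sentiment lexicon (illustrative values only, not a real resource).
POLARITY = {"helpful": 1, "great": 1, "boring": -1, "crowded": -1}


def concept_sentiment(posts):
    """Sum the word polarity of each post into every concept it mentions."""
    totals = defaultdict(int)
    for post in posts:
        # Lowercase and split into alphabetic tokens.
        words = re.findall(r"[a-z]+", post.lower())
        # Post-level score: sum of known word polarities, 0 for unknown words.
        score = sum(POLARITY.get(w, 0) for w in words)
        # Concepts mentioned in this post, each counted once per post.
        concepts = {CONCEPT_OF[w] for w in words if w in CONCEPT_OF}
        for concept in concepts:
            totals[concept] += score
    return dict(totals)


# Made-up example posts, standing in for free text from social media.
posts = [
    "My tutor was really helpful",
    "The lecturer is boring",
    "Great teacher",
    "The library is crowded",
]
print(concept_sentiment(posts))
```

Because 'tutor', 'lecturer' and 'teacher' all resolve to the same concept, the three posts about teaching staff contribute to a single score, while the post about the library is counted separately.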

Where you'll study

Cambridge

Funding

This project is self-funded.

Funded studentships are awarded through a competitive process and are advertised on our jobs website as they become available.

Next steps

If you wish to be considered for this project, you will need to apply for our Computer and Information Science PhD. In the section of the application form entitled 'Outline research proposal', please quote the above title and include a research proposal.


