Statistical Analysis of Literature and Social Media

  • Full or part time
  • Application Deadline
    Friday, January 31, 2020
  • Competition Funded PhD Project (Students Worldwide)

Project Description

The field of stylometry uses statistical techniques to analyse literature and answer questions about authorship. A typical question would be “Given two different pieces of writing, is it possible to determine whether both pieces were written by the same author, or by two different authors?”. This is often formulated as a supervised learning problem: the goal is to build a statistical or machine learning model from a training set of previous (known) works by each candidate author, and then to use this model to infer the probability that each candidate is the author of the new text.
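As a rough illustration of the supervised formulation described above, the following Python sketch fits a multinomial naive Bayes model to function-word counts and returns a posterior probability for each candidate author. It is a minimal sketch under stated assumptions, not the method the project will use: the word list, the toy texts, the uniform prior, and the names `FUNCTION_WORDS`, `train` and `posterior` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical, deliberately tiny function-word list; real studies use far more.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "but", "upon", "while", "whilst"]


def counts(text):
    """Count occurrences of each function word in a text."""
    tokens = Counter(text.lower().split())
    return [tokens[w] for w in FUNCTION_WORDS]


def train(known_works):
    """Estimate smoothed log-probabilities of each function word per author."""
    model = {}
    for author, texts in known_works.items():
        totals = [1] * len(FUNCTION_WORDS)  # add-one (Laplace) smoothing
        for text in texts:
            totals = [t + c for t, c in zip(totals, counts(text))]
        n = sum(totals)
        model[author] = [math.log(t / n) for t in totals]
    return model


def posterior(model, text):
    """Posterior probability of each author, assuming a uniform prior."""
    x = counts(text)
    log_lik = {a: sum(c * lp for c, lp in zip(x, logp)) for a, logp in model.items()}
    top = max(log_lik.values())  # subtract the maximum for numerical stability
    weights = {a: math.exp(v - top) for a, v in log_lik.items()}
    z = sum(weights.values())
    return {a: w / z for a, w in weights.items()}


# Toy training set (invented text, for illustration only).
known_works = {
    "author_a": ["whilst the rain fell upon the roof whilst we waited upon the hill"],
    "author_b": ["while the rain fell on the roof but we waited in the hall and in the barn"],
}
model = train(known_works)
probs = posterior(model, "whilst we sat upon the wall")
print(probs)
```

With such short texts the probabilities are only indicative; the sketch is meant to show the shape of the problem (known works in, a probability per candidate out) rather than a usable classifier.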

Such techniques have previously been used for the analysis of literary works, such as detecting forgeries when a newly discovered work is claimed to have been written by some famous author (e.g. Shakespeare). Recently, there has been an increased interest in applying these techniques to the analysis of social media data. Questions here might include:

- If a person claims that their social media account has been hacked, is it possible to determine whether posts that have been made after the hack were really written by the original author?

- If we suspect that two user accounts on a platform are controlled by the same person, is it possible to confirm this using statistical analysis?

This project aims to develop new methodology for the analysis of writing, and to apply it to both literary and social media data. There are many potential projects in this area. Possible methodological topics include the use of hierarchical modelling or regularisation to help scale traditional stylometric methods up to large social media datasets, nonparametric modelling of authorship style, and unsupervised learning for settings where we do not have a training set for each author.
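One way to picture the role of hierarchical modelling or regularisation on social media data is partial pooling: an account with very few posts has its word-usage rates pulled towards rates pooled over all accounts, while an account with many posts is left largely unchanged. The Python sketch below uses the posterior mean under a Dirichlet prior as an assumed example of this idea; the function name `shrunken_rates`, the `strength` parameter and all the numbers are hypothetical.

```python
def shrunken_rates(author_counts, pooled_rates, strength=50.0):
    """Posterior-mean word rates under a Dirichlet prior centred on pooled rates.

    `strength` acts like a number of pseudo-observations: larger values
    pull sparse accounts more strongly towards the pooled rates.
    """
    n = sum(author_counts)
    return [(c + strength * p) / (n + strength)
            for c, p in zip(author_counts, pooled_rates)]


# Hypothetical rates for three words, pooled over all accounts.
pooled = [0.5, 0.3, 0.2]

sparse = shrunken_rates([2, 0, 0], pooled)     # account with only 2 observed tokens
rich = shrunken_rates([2000, 0, 0], pooled)    # account with 2000 observed tokens

print(sparse)  # stays close to the pooled rates
print(rich)    # dominated by the account's own counts
```

The sparse account's estimate stays near the pooled rates, whereas the data-rich account's estimate follows its own counts, which is the behaviour that makes this kind of shrinkage attractive when many authors have little text.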

Entry requirements:


• A Bachelor’s degree in Statistics, Mathematics, Physics, Computer Science or similar (a First Class or good Upper Second Class Honours degree, or the equivalent from an overseas university);

• Strong verbal and written communication skills in English.

Funding Notes

This project is funded by a University of Edinburgh scholarship which fully covers the cost of tuition fees and provides an annual stipend. This scholarship is open to home, EU, and overseas students.

FindAPhD. Copyright 2005-2020
All rights reserved.