Turning Science Fiction into Data Science

This project is no longer listed and may not be available.

  • Supervisors
    Dr E Finer
    Dr C Helling
    Dr V A Smith
  • Application Deadline
    No more applications being accepted
  • Funded PhD Project (Students Worldwide)

Project Description

What happens when we treat books as data? Can literary theory ever meet scientific standards? By applying a range of historical and current methods of quantitative analysis to a body of science fiction, this project will investigate claims for objectivity in literary theory. Given that science fiction has successfully predicted future scientific discoveries, the data sets created by different methods of quantitative analysis have the potential to influence not only the future of science fiction, but the course of science itself.
By combining the expertise of a literary researcher, an exoplanet scientist, and a computational biologist as co-supervisors, this doctoral project will be uniquely placed to discover and critique interrelationships between these disciplines. All three co-supervisors are members of the Centre for Exoplanet Science, and the project will focus on science fiction about exoplanets. A central question for the Centre and for this project is: how would human society respond to life beyond our planet? Science fiction provides a multitude of thought experiments addressing this very question, ranging in outlook from negative (for example, H. G. Wells’s The War of the Worlds) to positive (for example, C. Sagan’s Contact). Analysing science fiction as a dataset, using computational techniques from both digital humanities and artificial intelligence, may hint at a societal gestalt that could help predict reactions to real extraterrestrial contact.
The Russian Formalists prioritised scientific rigour and objectivity in their study of literature. They viewed literary history as an evolving system rather than a succession of individual works and their authors, and they rejected interpretation in favour of analysis of form. Their publications included Tomashevskii’s quantitative analysis of rhymes and syllables, presented as a series of 81 graphs, and Propp’s better-known derivation of the 31 common plot functions present in Russian folktales. Lenin’s speeches were data-mined for the recurring collocations that made them rhetorically successful. More recently, Moretti has treated the European novel as big data and applied to literary history a Darwinian evolutionary model based on the divergence of biological species and their survival through natural selection. In a landmark study, Spurgeon created a procedure to “assemble, sort, and examine” Shakespeare’s metaphors not “to point or to illustrate any preconceived idea or thesis, but they are studied with a perfectly open mind to see what information they yield.” Despite their claims to objectivity and their comparison of data from different bodies of literature, these approaches were highly specific to their material and were not repeated. Even predictive statements were speculative, such as Brik’s “Eugene Onegin would have been written even if there had been no Pushkin”.
The project will combine modern data science with literary theory to provide principled, replicable, and predictive analyses. It aims to:
1. Modify and advance Formalist methods with the benefit of digital tools that were previously unavailable, and apply them to two representative science fiction corpora (for example, Russian and English), producing mineable datasets.
2. Apply machine learning to these datasets to reveal cultural markers relevant to society’s imagination of, and potential reaction to, extraterrestrial life, and develop predictive Bayesian networks capable of reflecting a corpus’s attitude towards humanity’s interaction with extraterrestrials (for example, by providing a probability distribution over agonistic/peaceful outcomes based on randomly generated scenarios).
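As a rough illustration of the second aim, the sketch below shows one very simple way a probability distribution over agonistic/peaceful outcomes might be computed for a randomly generated scenario. It is not the project's method. It assumes each story in a corpus has already been reduced to a few categorical annotations, and it uses the simplest possible Bayesian network, in which the outcome is the sole parent of every feature (a naive Bayes structure). The feature names (contact, tech_gap), their values, the six example stories, and the function names fit, posterior, and random_scenario are all hypothetical.

```python
import random
from collections import Counter, defaultdict

# Hypothetical annotations (not real corpus data): each story is reduced to
# categorical features plus an outcome label.
STORIES = [
    ({"contact": "invasion", "tech_gap": "large"}, "agonistic"),
    ({"contact": "invasion", "tech_gap": "small"}, "agonistic"),
    ({"contact": "signal", "tech_gap": "large"}, "peaceful"),
    ({"contact": "signal", "tech_gap": "small"}, "peaceful"),
    ({"contact": "visit", "tech_gap": "large"}, "agonistic"),
    ({"contact": "visit", "tech_gap": "small"}, "peaceful"),
]


def fit(stories, alpha=1.0):
    """Count outcomes and feature values; alpha is the Laplace smoothing weight.

    Assumes every story is annotated with every feature.
    """
    outcome_counts = Counter(outcome for _, outcome in stories)
    feature_counts = defaultdict(Counter)  # (feature, outcome) -> value counts
    values = defaultdict(set)              # feature -> set of observed values
    for features, outcome in stories:
        for name, value in features.items():
            feature_counts[(name, outcome)][value] += 1
            values[name].add(value)
    return outcome_counts, feature_counts, values, alpha


def posterior(model, scenario):
    """Return P(outcome | scenario) under the naive Bayes structure."""
    outcome_counts, feature_counts, values, alpha = model
    total = sum(outcome_counts.values())
    scores = {}
    for outcome, n in outcome_counts.items():
        # Smoothed prior P(outcome).
        p = (n + alpha) / (total + alpha * len(outcome_counts))
        for name, value in scenario.items():
            # Smoothed conditional P(feature = value | outcome).
            count = feature_counts[(name, outcome)][value]
            p *= (count + alpha) / (n + alpha * len(values[name]))
        scores[outcome] = p
    norm = sum(scores.values())
    return {outcome: score / norm for outcome, score in scores.items()}


def random_scenario(model, rng):
    """Draw one value per feature uniformly from the values seen in the corpus."""
    _, _, values, _ = model
    return {name: rng.choice(sorted(vals)) for name, vals in values.items()}


if __name__ == "__main__":
    model = fit(STORIES)
    scenario = random_scenario(model, random.Random())
    print(scenario, posterior(model, scenario))
```

Fitting one such model per corpus (for example, one for the Russian corpus and one for the English) and querying both with the same scenarios would give a crude way to compare the attitudes the two corpora encode. A realistic network would need richer annotations and dependencies between features, which this sketch leaves out.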
Outcomes will include:
• Recalibration of our contemporary vision of Russian Formalism in the context of the 21st century humanities and sciences;
• New approaches to Digital Humanities;
• A roadmap for integrating current scientific research and new science fiction; and
• New methods for exoplanet scientists to communicate their findings to society.


