Retrieving and Compiling Knowledge of Antarctic Biology using Natural Language Processing and Geographic Information Retrieval
Ice-free areas of Antarctica (primarily formerly glaciated areas and nunataks) harbour unique and unexpectedly diverse flora and fauna, and the sparsity of suitable ice-free habitats and barriers to redistribution have created complex biogeographic patterns in terrestrial Antarctica. The distribution and activity of these endemic biological communities are overwhelmingly determined by the availability of liquid water; however, observed and predicted localized warming may significantly change the hydrological patterns in terrestrial Antarctica, leading to non-linear and heterogeneous changes in biological communities.
Over fifty years of research has shown that the Ross Sea Region (RSR, also known as the Ross Dependency) of Antarctica, administered by New Zealand, contains rich and iconic biological communities. The Antarctic Science Platform, a New Zealand government initiative, aims to leverage this wealth of information to comprehensively describe the terrestrial biological communities across the RSR, creating a framework with which warming-induced changes in the terrestrial biology can be modelled and projected.
However, information and knowledge of RSR biological communities predominantly exist as text and printed tables across a large body of scientific literature (estimated to be more than 20,000 journal articles and book chapters), a significant portion of which was published before global positioning system (GPS) receivers became widely available to field scientists. Retrieving and compiling knowledge of RSR biology therefore requires sophisticated and novel applications of natural language processing, geographic information retrieval, and deep learning techniques.
A PhD scholarship is available from the University of Waikato to extract and compile knowledge of biology across the RSR. The PhD candidate will work with Drs Charles Lee at the University of Waikato (expertise in Antarctic terrestrial biology) and Fraser Morgan at Manaaki Whenua - Landcare Research (expertise in informatics) to create an informatics pipeline that can retrieve biological information from a variety of data sources, including natural language texts and structured databases; extract (or approximate) the geographic information associated with occurrences of flora and fauna; and systematically compile the retrieved data for validation.
Applicants must have an MSc or BSc Honours (or equivalent) in computer or information science. Experience with natural language processing, geographic information retrieval, geoparsing, and data mining is highly desirable. As part of this research, the candidate will likely join expeditions to the Ross Sea Region to conduct targeted sampling for validation and additional data gathering, so reasonable physical fitness is required.
This scholarship is fully funded by the Antarctic Science Platform and open to qualified individuals of any nationality or gender, but it is the successful applicant’s responsibility to secure a student visa for New Zealand. Applicants must meet all entrance requirements for the PhD programme at the University of Waikato. The scholarship covers both tuition fees and living expenses for three years.