Discovering hidden patterns in high-dimensional data for materials science
Dr V Kurlin
Prof Andy Cooper
No more applications being accepted
Funded PhD Project (European/UK Students Only)
Chemists dream of discovering nanoporous materials with desired properties. Past methods resemble a slow search for a needle in a haystack, because the space of hypothetical materials is huge. The Cambridge Structural Database (CSD) contains more than 900K existing crystals. To visualise the shape of the CSD, we will identify clusters of similar crystals and explore higher-level patterns, e.g. circular chains of clusters or longer branches leading to exotic materials. The student will design and implement an interactive tool with a slider interface to vary similarity thresholds and find hidden patterns, which will guide a more efficient further search.
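As a rough illustration of what moving such a slider could mean, the minimal sketch below groups points into clusters at several similarity thresholds. It is not the project's method: the function name clusters_at_threshold, the 2D descriptors and the threshold values are hypothetical, and plain Euclidean distance with single-linkage merging is assumed in place of a real similarity measure on crystals.

```python
from itertools import combinations
from math import dist


def clusters_at_threshold(points, threshold):
    """Single-linkage clusters: two points share a cluster if a chain of
    pairwise distances <= threshold connects them (hypothetical helper)."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative, halving paths.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of points that is closer than the threshold.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    # Collect point indices by representative.
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical 2D descriptors standing in for crystals (assumption).
descriptors = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0)]

# Each threshold plays the role of one slider position.
for threshold in (0.05, 0.5, 2.0, 10.0):
    print(threshold, clusters_at_threshold(descriptors, threshold))
```

As the threshold grows, the five singletons merge first into two pairs plus an outlier, then into one large cluster plus the outlier, and finally into a single cluster. The thresholds at which clusters merge are the kind of information a slider would expose.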
The student will be based in the Department of Computer Science or the new £68M research institute MIF (Materials Innovation Factory) within the University of Liverpool, UK. The supervisors are Dr Vitaliy Kurlin (http://kurlin.org) and Prof Andy Cooper FRS (academic director of the MIF). The project can be considered a part of the new Centre for Topological Data Analysis (https://www.maths.ox.ac.uk/groups/topological-data-analysis, joint with the Universities of Oxford and Swansea) funded by the recent £2.8M EPSRC grant "Application-driven Topological Data Analysis" (EP/R018472/1, https://news.liverpool.ac.uk/2018/01/29/liverpool-partners-in-new-centre-for-topological-data-analysis/).
The Centre for Topological Data Analysis will study the shape of data through the development of new mathematics and algorithms, and will build on existing data science techniques in order to obtain and interpret the shape of data. Modern science and technology generate data at an unprecedented rate. A major challenge is that this data is often complex and high-dimensional, and may include temporal and/or spatial information. The "shape" of the data can be important, but it is difficult to extract and quantify using standard machine learning or statistical techniques. For example, an image of blood vessels near a tumour looks very different from an image of healthy blood vessels; statistics alone cannot quantify this difference, so new methods are required.
The theoretical fields of mathematics that enable the study of shapes are geometry and topology. Quantifying the shape of complicated objects is only possible with advanced mathematics and algorithms. Topological Data Analysis (TDA) enables one to use methods of topology and geometry to study the shape of data. In particular, a method known as persistent homology provides a summary of the shape of the data (e.g. features such as holes) at multiple scales. A key success of persistent homology is its ability to provide robust results even if the data are noisy. There are theoretical and computational challenges in applying these algorithms to large-scale, real-world data. The aim is to build on current persistent homology tools, extending them theoretically and computationally, and adapting them for applications. Our team is composed of pure and applied mathematicians, computer scientists, and statisticians.
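To make "a summary of the shape at multiple scales" concrete, the sketch below computes persistent homology in dimension 0 only, i.e. it tracks how connected components of a point cloud merge as a distance scale grows. It is a simplified illustration and not the Centre's software: the function name h0_persistence and the sample cloud are hypothetical, Euclidean distance is assumed, and holes (dimension 1 and higher) need dedicated libraries.

```python
from itertools import combinations
from math import dist, inf


def h0_persistence(points):
    """Return (birth, death) pairs for connected components of a point cloud.
    Every component is born at scale 0 and dies when it merges into another;
    one component never dies (hypothetical helper, dimension 0 only)."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the component representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all pairwise distances from shortest to longest.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge at this scale, so one of them dies.
            parent[root_i] = root_j
            pairs.append((0.0, length))
    if n > 0:
        # The last surviving component persists at all scales.
        pairs.append((0.0, inf))
    return pairs


# Hypothetical noisy cloud: two tight groups far apart (assumption).
cloud = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0), (3.1, 3.0)]
for birth, death in h0_persistence(cloud):
    print(f"component: birth={birth:.2f}, death={death:.2f}")
```

Short-lived pairs correspond to noise within each group, while the single long finite pair records that the cloud consists of two well-separated groups. Small perturbations of the points change the death values only slightly, which is the sense in which such summaries are robust.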
Applications are welcomed from students with a 2:1 or higher (60% grade point average) master's or BSc degree, or equivalent, in Mathematics, Statistics, Computer Science or Computational Chemistry. The essential requirements are programming experience (preferably C/C++, or Python, Java, Matlab, R) and excellent communication skills for working in a large team. The project will involve close collaboration with colleagues from different areas and with industry partners, e.g. CCDC (Cambridge Crystallographic Data Centre), IBM Research UK (Hartree Centre in the Daresbury Laboratory), STFC (Science and Technology Facilities Council).
How to apply.
Enquiries can be sent to [Email Address Removed] before making a formal application. Applications can be submitted as described at https://www.liverpool.ac.uk/computerscience/postgraduate/phdstudy/applications. Applications should list Dr Vitaliy Kurlin as the potential supervisor and choose the option "School funded PhD" when asked how the PhD will be funded. Applications must contain a cover letter, a curriculum vitae or resume, copies of undergraduate and graduate transcripts, a 1-2 page research statement describing how the applicant’s qualifications and research interests would fit the project, a copy of the applicant’s bachelor’s or master’s thesis, and the names and contact information of academic references.
The PhD is funded for 3 years from October 2018 by the School of Electrical Engineering, Electronics and Computer Science at the University of Liverpool (UK). The funding of 20K GBP per year covers the tuition fees for UK/EU students (about 4.2K GBP per year) and a tax-free bursary. Additional travel funds may be available from various grants in the university for presenting research work at top conferences.