
Inferring the gas mass surface density of dense star-forming clouds at high-angular resolution using machine learning

   Cardiff School of Physics and Astronomy

This project is no longer listed and may not be available.

Dr N Peretto, Dr T Davis, Dr P Clark
No more applications being accepted
Competition Funded PhD Project (Students Worldwide)

About the Project

The UKRI CDT in Artificial Intelligence, Machine Learning and Advanced Computing (AIMLAC) aims to train the next generation of AI innovators across a broad range of STEMM disciplines. The CDT provides advanced multi-disciplinary training in an inclusive, caring and open environment that nurtures each individual student to achieve their full potential. Applications are encouraged from candidates from diverse backgrounds who can positively contribute to the future of our society.

The H2 column density (NH2), or equivalently the mass surface density, is a fundamental parameter of star formation models. On galaxy scales, it is believed to be a good predictor of star formation rates (e.g. [1]). On parsec scales and below, it is often used to identify sub-structures such as filaments and cores, i.e. the direct progenitors of individual stars. Our ability to derive accurate mass surface densities is therefore central to our understanding of star formation throughout the Galaxy and beyond. For the past 10 years, H2 column density images of star-forming clouds have mostly been obtained from Herschel continuum data (e.g. [2]). While extremely powerful, Herschel column density images are limited to a resolution of 18” at best, which is too low to probe the internal structure of most star-forming clouds within the Milky Way. By contrast, molecular line observations of these clouds obtained with some of the most powerful (sub-)millimetre telescopes in the world, such as ALMA and NOEMA, provide a detailed view at arcsecond resolution (e.g. [3]). At present, however, deriving H2 column density images from such observations is very time-consuming, as it requires modelling every single spectrum, and each ALMA/NOEMA observation typically holds several tens of thousands of them. To speed up this process, the student will use machine learning, extending the analysis we published in [4]. In that study we showed how H2 column densities can be predicted from molecular line emission using the Random Forest algorithm.
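The idea of predicting H2 column density from line emission with a Random Forest can be illustrated with a minimal sketch. It is not the analysis of [4]: the data are synthetic, the two "tracers" and their toy response functions are assumptions made purely for illustration, and scikit-learn's RandomForestRegressor stands in for whatever implementation the project would actually use.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a training set: one row per map pixel.
# The target is log10 of the H2 column density in cm^-2.
n_pixels = 2000
log_nh2_true = rng.uniform(21.0, 23.5, n_pixels)

# Toy tracer responses (assumptions, not physical models):
# one line that saturates at high column density, and one that
# only becomes detectable above log10(N_H2) = 22.
w_saturating = 10.0 * (1.0 - np.exp(-10.0 ** (log_nh2_true - 22.0)))
w_dense_gas = 3.0 * np.clip(log_nh2_true - 22.0, 0.0, None)

# Feature matrix of hypothetical integrated intensities, with noise added.
features = np.column_stack([w_saturating, w_dense_gas])
features += rng.normal(0.0, 0.1, features.shape)

# Hold out a quarter of the pixels to check the prediction quality.
X_train, X_test, y_train, y_test = train_test_split(
    features, log_nh2_true, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# R^2 and RMS error (in dex) on pixels the model has not seen.
r2 = model.score(X_test, y_test)
rms = float(np.sqrt(np.mean((model.predict(X_test) - y_test) ** 2)))
print(f"R^2 on held-out pixels: {r2:.2f}, RMS error: {rms:.2f} dex")
```

Once trained, the model predicts a column density for each pixel directly from its intensities, which is what removes the need to model every spectrum individually. With real data, the feature columns would be replaced by observed maps resampled to a common pixel grid.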

The student will use Herschel mass surface density and dust temperature images of Galactic star-forming clouds obtained at 18” resolution, alongside ALMA integrated N2H+(1-0) intensity and dust continuum images at ~1” resolution, as inputs to the Random Forest algorithm. N2H+(1-0) is the main contributor to reproducing the high-density parts (Av>10) of the H2 column densities [4], so the proposed combination of data will allow us to accurately recover H2 column density images of our cloud sample at the same angular resolution as our ALMA/NOEMA data. This approach requires a training dataset, for which we will use single-dish large-scale N2H+ mapping of nearby star-forming clouds recently performed within two IRAM 30m large programmes, ORIONB (PI: J. Pety) and LEGO (PI: J. Kauffmann). The physical resolution of these observations (27” at 400pc -> ~0.05pc) is very similar to that of our clump sample (1” at 4kpc -> ~0.03pc). If needed, we will also use synthetic observations obtained from numerical simulations produced locally in Cardiff (by Dr P. Clark) to train the algorithm. The outcome of this PhD project could be a real game-changer in the field of Galactic star formation.
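The conversion from angular to physical resolution used in this comparison follows from the small-angle approximation: linear size = angle in radians × distance. The short sketch below applies it to the two cases quoted; the function name is hypothetical. For a beam of exactly 1” at 4 kpc the formula gives closer to 0.02 pc, and the quoted ~0.03 pc would correspond to a beam of about 1.5”. Either way the two datasets probe comparable physical scales.

```python
import math

# One arcsecond expressed in radians.
ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)


def physical_resolution_pc(beam_arcsec: float, distance_pc: float) -> float:
    """Linear size in pc subtended by a beam (arcsec) at a distance (pc)."""
    return beam_arcsec * ARCSEC_IN_RAD * distance_pc


# IRAM 30m single-dish mapping of a nearby cloud: 27" at 400 pc.
single_dish = physical_resolution_pc(27.0, 400.0)

# Interferometric data on a more distant clump: 1" at 4 kpc.
interferometer = physical_resolution_pc(1.0, 4000.0)

print(f"single dish: {single_dish:.3f} pc, interferometer: {interferometer:.3f} pc")
```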

Start date: 1st October 2022

The UKRI CDT in Artificial Intelligence, Machine Learning and Advanced Computing provides 4-year, fully funded PhD opportunities across broad research themes:

  • T1: data from large science facilities (particle physics, astronomy, cosmology)
  • T2: biological, health and clinical sciences (medical imaging, electronic health records, bioinformatics)
  • T3: novel mathematical, physical, and computer science approaches (data, hardware, software, algorithms)

 Its partner institutions are Swansea University (lead institution), Aberystwyth University, Bangor University, University of Bristol and Cardiff University.

Training in AI, high-performance computing (HPC) and high-performance data analytics (HPDA) plays an essential role, as does engagement with external partners, which include large international companies, locally based start-ups and SMEs, and government and Research Council partners. Training will be delivered via cohort activities across the partner institutions.

Positions are funded for 4 years, including 6-month placements with the external partners. The CDT will fill 10 positions in 2022.

The partners include: We Predict, ATOS, DSTL, Mobileum, GCHQ, EDF, Amplyfi, DiRAC, Agxio, STFC, NVIDIA, Oracle, QinetiQ, Intel, IBM, Microsoft, Quantum Foundry, Dwr Cymru, TWI and many more.

More information, and a description of research projects, can be found at the UKRI CDT in Artificial Intelligence, Machine Learning & Advanced Computing website.

How to apply:

To apply, and for further details, please visit the CDT website and follow the instructions to apply online.

This includes an online application for this project at (with a start date of 1st October 2022):

Applicants should submit an application for postgraduate study via the Cardiff University webpages including:

• your academic CV
• a personal statement/covering letter
• two references, at least one of which should be academic
• your degree certificates and transcripts to date

In the "Research Proposal" section of your application, please specify the project title and supervisors of this project.

In the funding section, please select that you will not be self-funding and write that the source of funding will be “AIMLAC CDT”.

The deadline for applications for the UKRI CDT Scholarship in Artificial Intelligence, Machine Learning and Advanced Computing (AIMLAC) is 12th February 2022. However, AIMLAC will continue to accept applications until the positions are filled.

For general enquiries, please contact Rhian Melita Morris [Email Address Removed]


The typical academic requirement is a minimum of a 2:1 degree in physics and astronomy or a relevant discipline.

Applicants whose first language is not English are normally expected to meet the minimum University requirements (e.g. 6.5 IELTS).

Candidates should be interested in AI and big data challenges, and in (at least) one of the three research themes. You should have an aptitude and ability in computational thinking and methods (as evidenced by a degree in physics and astronomy, medical science, computer science, or mathematics, for instance) including the ability to write software (or willingness to learn it).

Funding Notes

The UK Research and Innovation (UKRI) fully-funded scholarships cover the full cost of 4 years of tuition fees, a UKRI standard stipend (currently £15,921 per annum) and additional funding for training, research and conference expenses. The scholarships are open to UK and international candidates.


[1] Kennicutt, 1998, ApJ, 498, 541
[2] André et al., 2010, A&A, 518, 102
[3] Peretto et al., 2013, A&A, 555, 112
[4] Gratier et al., 2021, A&A, 645, 27
