Efficient post-processing of climate and weather data is key to data analysis. At the moment, scientists use Python toolkits such as Pangeo and command-line tools such as CDO. The command-line tools often suffer from limited parallelism, while the Python tools are not suitable for online data processing. In addition, the integration of AI-based data analytics is lacking or inefficient.
The goal of this thesis is to develop and implement concepts and improved tools that enable efficient nearline post-processing of huge volumes of climate and weather data.
This encompasses 1) efficient node-local processing on GPUs, 2) concepts for scalable processing of massive data volumes in a cluster that, ideally, can run concurrently with applications (in-transit processing), and 3) the integration of AI analytics into the workflow. The work will be embedded in the ACES research group and conducted in close collaboration with NVIDIA in the context of the research project ESiWACE2. It will be integrated into a broader vision for future storage and compute interfaces that supports scientists in climate and weather research as well as in other domains.