SIParCS 2024 - Anh Pham
DART-X: Software Infrastructure for Prototyping in-memory Data Transfer between Ensemble Data Assimilation and Coupled Earth Systems Models
Data assimilation (DA) is a powerful tool for integrating real-world data into coupled climate models, capturing the complexity of the Earth system and improving prediction accuracy. NSF NCAR’s Data Assimilation Research Testbed (DART) is an ensemble DA tool used for climate predictions with NSF NCAR’s Community Earth System Model (CESM). Traditionally, DART has influenced numerical models by modifying "restart" files written to disk. This approach requires significant I/O and repeated model stop/restart cycles, which incur high computational costs for high-dimensional models, even on large supercomputers. The goal of this project is to explore the feasibility of building an interface (a “cap”) between DART and CESM using the National Unified Operational Prediction Capability (NUOPC) layer, enabling direct in-memory data transfer between the two and avoiding disk I/O bottlenecks.
This project establishes a software infrastructure for developing and testing the NUOPC cap for DART, making full use of the existing software and integrating the cap into a working system. The challenge is to build a new interface with minimal disruption to existing features and to resolve incompatibilities between software packages with extensive codebases. I will discuss the decisions made on tools, frameworks, and workflow optimization, including midstream adaptations made while the cap code was being developed in parallel. I will also describe practices for resolving conflicts using the build system. This infrastructure supports the first prototype for in-memory data transfer between DART and a coupled Earth system model, making it possible to explore potential gains in efficiency and scalability in data assimilation.
Mentors: Helen Kershaw, Dan Amrhein, Ufuk Turuncoglu
Slides and poster