SIParCS 2022 - Joe Ammatelli
Increasing the portability and reproducibility of a scientific application using containers and Spack
Reproducibility underpins strong science and useful software. Yet building and deploying scientific applications on different platforms for testing or analysis can be challenging, because they often have tens or hundreds of build dependencies and require system-specific configuration. Scientists and software developers alike want to spend less time configuring and more time synthesizing important insights and implementing new tools. In the last decade, containers, which allow applications to be deployed on different platforms with little modification, have been widely integrated into software pipelines. In this project, we investigate whether constructing containerized software environments with Spack, a package manager for scientific software, can enable both efficient and portable software builds for scientific applications. We develop a suite of containers customized for running Samurai, a program with known build complications that converts airborne radar data into a wind field, on both CPUs and GPUs. Beginning with developer containers equipped with Spack and the software necessary to build Samurai, we generate lightweight containers that include only the program executable and its linked libraries. Evaluating the lightweight containers on the Cheyenne and Casper computing clusters at the National Center for Atmospheric Research (NCAR), we demonstrate that the containerized application requires little configuration at deployment, retains program correctness, and provides competitive performance compared to the bare-metal equivalent on both CPUs and GPUs. Future efforts will focus on containerizing multiple performance-constrained applications in the same container and contributing containers to the E4S project to accelerate the transition to exascale computing.
Mentors: Jian Sun, Brian Vanderwende, John Dennis
Slides and poster