CISL Annual Report: FY2022
The NCAR Computational and Information Systems Laboratory (CISL) achieved a number of significant milestones in fiscal year 2022. In the process, CISL made noteworthy advances in high-performance computing, furthered its substantial progress in computational science research and development, and met key goals in enhancing data science and services.
Throughout the year, CISL staff dealt with the ongoing impact of the COVID-19 pandemic, particularly the delays in manufacturing and delivery of the anticipated Derecho supercomputing system. By the end of FY22, most staff had settled into newly adopted flexible work arrangements, and preparations for the new system continued in anticipation of its delivery in December 2022.
Report contents
Delivering Diverse and Cutting-Edge HPC Environments
Preparing for Exascale Computing
Data Repositories and Services
Advancing Data Science
Outreach, Diversity and Education
Delivering Diverse and Cutting-Edge HPC Environments
Early in FY22, CISL was deep into preparing the NCAR and university communities to use the extensive GPU-computing capabilities of the new Derecho system. In February 2022, the lab presented the first in a series of 16 workshops and tutorials developed for that purpose. Subsequent sessions were held every few weeks through September, and 200 individuals registered for the series. Recordings and slides remain available for anyone to review.
In April, soon after the launch of the GPU training series, CISL took delivery of a test system known as Gust, a prototype used to mimic the new Derecho HPE Cray EX cluster’s hardware, software, user environment, and job execution configuration. Gust features include both CPU and GPU nodes, a small Lustre scratch file system, the Cray Programming Environment, traditional compilers and libraries provided by CISL, and the PBS job scheduler.
By mid-summer, CISL staff had readied Gust for early users to begin testing and preparing their codes and models to run in the new environment. Among the early users, priority was given to selected Accelerated Scientific Discovery (ASD) projects. These included six projects proposed by university researchers and nine led by personnel from six NCAR labs.
The university projects approved by the CISL HPC Allocations Panel cover science domains including climate, hydrology, fluid dynamics, and magnetospheric physics. NCAR projects were chosen for their strategic significance and suitability for the Derecho system. The NCAR project science domains range from atmospheric chemistry, meteorology and climate to fluid dynamics, machine learning, oceanography, paleoclimate and solar physics.
Additional preparations for the Derecho era focused on the CISL Systems Accounting Manager (SAM) and the Advanced Research Computing (ARC) web portal. For example, new features added to SAM accommodate a projected threefold increase in computing activity, account more accurately for dynamic resource allocation, and support the more extensive use of GPUs. The ARC portal now also provides access to HPC documentation, CISL’s Daily Bulletin, and a new interface for users submitting allocation requests.
At the end of FY22, supply chain delays and rulings by the Defense Priorities and Allocations System meant that Derecho delivery was not expected until late 2022, with acceptance anticipated in May 2023. ASD projects could be conducted from then through June 2023, at which point the system would be available to the general user community.
Those delays in delivery of the new system resulted in a commitment to supporting continued use of the Cheyenne supercomputer, which became operational at the beginning of 2017. Throughout FY22, Cheyenne and the Casper analysis and visualization system were heavily used and remarkably reliable.
- Cheyenne supported more than 1,900 unique users at more than 300 universities and other institutions. Nearly 1.4 billion core-hours were delivered to more than 850 projects on Cheyenne and Casper.
- Cheyenne’s reliability and usage continued to climb to all-time highs despite the effects of the pandemic. Cheyenne averaged more than 98.7% user availability and 85.4% daily user utilization. Casper averaged more than 97.4% availability and 53.6% user utilization.
- CISL’s annual user survey, which collects information about new publications based on work done on NCAR HPC systems, received 343 responses. Users reported 459 new peer-reviewed publications in FY22 along with 154 other publications and 65 dissertations. Responses to survey questions about various services that CISL provides showed an overall increase in satisfaction from 4.73 in FY21 to 4.75 in FY22 (on a scale of 1 to 5).
Preparing for Exascale Computing
CISL continued its commitment to pursuing applied computational and data science research and creating highly scalable, performant, portable applications and data analysis tools for emerging exascale computing architectures. Staff reached several milestones for developing methods and tools to accelerate the pace of code optimization and porting, enabling NCAR applications to exploit new technologies such as GPUs and machine learning.
Notably in FY22, the Geoscience Community Analysis Toolkit (GeoCAT) team made excellent progress on Project Raijin, an NSF-awarded effort to develop community-owned, sustainable, scalable tools for operating on unstructured grids employed by next-generation climate and global weather models. The team formed a partnership with the Department of Energy (DOE) Simplifying ESM Analysis Through Standards (SEATS) project to co-develop UXarray, a Python package that extends Xarray with unstructured grid support. The partnership with SEATS effectively doubled the number of software engineers working on UXarray and helped ensure that the package will meet the needs of the DOE and NSF modeling communities. Monthly releases of UXarray have been made available on Anaconda. The highest number of downloads per version was more than 800, and all-time downloads exceeded 4,500.
Other achievements in FY22:
- A GPU-enabled science discovery capability for the MURaM magnetohydrodynamics model – the first unified CPU/GPU version of the MURaM code – was fully ported and will be released in the spring of 2023.
- As part of the NSF-funded EarthWorks project, CISL completed the GPU port of the PUMAS/MG3 microphysics package in the Community Atmosphere Model (CAM), a component of the Community Earth System Model (CESM).
- Also as part of the EarthWorks project, CISL helped complete the port of MPAS-7 onto GPUs.
- CISL led a coaching session for NCAR’s System for Integrated Modeling of the Atmosphere (SIMA) technology review and has provided recommendations to help improve processes within the project.
- CISL developed and tested a new Earth Computing Hyperparameter Optimization Python package for performing a distributed search across multiple CPU or GPU nodes for optimal machine learning hyperparameters. The first version was nearing completion at the end of FY2022 and has been tested by collaborators and visitors on a broad range of problems.
- CISL staff developed machine learning emulators for surface layer fluxes and the GECKO-A model for volatile organic compound reactions. The surface layer model was run successfully in the FastEddy GPU LES model and in WRF. The GECKO-A emulator ran successfully in box mode and was tested in the GEOS-Chem model.
- CISL staff collaborated with both the NCAR Earth Observing Laboratory and NCAR’s Mesoscale and Microscale Meteorology Laboratory to develop a deep learning algorithm for identifying the locations and sizes of particles in the HOLODEC airborne cloud particle imager. The model successfully transitioned from synthetic training data to real-world holograms.
- The four-day Trustworthy Artificial Intelligence for Environmental Science Summer School drew 734 registered participants. Daily attendance ranged from 218 to 497 individuals focused on how to develop trustworthy artificial intelligence for Earth system and environmental sciences.
- Creation of a GPU-enabled Lagrangian particle microphysics capability in Cloud Model 1, which will be used by a university-led ASD project in the spring of 2023.
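The distributed hyperparameter search mentioned in the list above can be illustrated with a minimal sketch. This is not the interface of CISL's package: the names `toy_validation_loss`, `sample_config`, and `parallel_random_search` are hypothetical, the objective is a stand-in for training a real model, and a local thread pool stands in for multiple CPU or GPU nodes. The strategy shown is plain random search, which may differ from what the package actually uses.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_validation_loss(config):
    """Hypothetical stand-in for training a model and returning its validation loss."""
    lr, width = config["learning_rate"], config["hidden_units"]
    # Loss is smallest near learning_rate=0.01 and hidden_units=64 (illustrative only).
    return (lr - 0.01) ** 2 * 1e4 + ((width - 64) / 64) ** 2


def sample_config(rng):
    """Draw one candidate configuration from an assumed search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform over [1e-4, 1e-1]
        "hidden_units": rng.choice([16, 32, 64, 128, 256]),
    }


def parallel_random_search(n_trials, n_workers, seed=0):
    """Evaluate n_trials random configurations concurrently and return the best one."""
    rng = random.Random(seed)
    # Sample every configuration up front so results are reproducible for a given seed.
    configs = [sample_config(rng) for _ in range(n_trials)]
    # Each worker evaluates trials independently; on a cluster each worker
    # would be a separate node or GPU rather than a local thread.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        losses = list(pool.map(toy_validation_loss, configs))
    best = min(range(n_trials), key=losses.__getitem__)
    return configs[best], losses[best]


if __name__ == "__main__":
    best_config, best_loss = parallel_random_search(n_trials=200, n_workers=4)
    print(best_config, best_loss)
```

Because trials are independent of one another, this kind of search scales by adding workers; the only shared step is the final comparison of losses.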
Data Repositories and Services
Developing and enhancing data repository software and delivering world-class data services for Earth system science continued to be a CISL priority. Staff provided these services reliably while adding new capabilities based on stakeholder input and adopting new technology. CISL’s infrastructure and software now enable users to extract only what they need from multi-terabyte data sets by defining and retrieving data products that meet their specifications for parameters, temporal and spatial domains, and output data format.
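A subset request of this kind can be sketched as a small specification applied to a collection of records. Everything in the sketch is hypothetical: the `SubsetRequest` fields, the record layout, and the `extract_subset` function are illustrative only and do not represent CISL's actual interface. Output-format selection is omitted for brevity.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SubsetRequest:
    """Hypothetical description of the data product a user wants."""
    parameters: frozenset  # variable names to keep
    start: datetime        # inclusive start of the temporal domain
    end: datetime          # inclusive end of the temporal domain
    lat_range: tuple       # (south, north) in degrees
    lon_range: tuple       # (west, east) in degrees


def extract_subset(records, request):
    """Return only the records that fall inside the requested domain."""
    south, north = request.lat_range
    west, east = request.lon_range
    return [
        r for r in records
        if r["parameter"] in request.parameters
        and request.start <= r["time"] <= request.end
        and south <= r["lat"] <= north
        and west <= r["lon"] <= east
    ]


if __name__ == "__main__":
    # Made-up sample records standing in for a much larger data set.
    records = [
        {"parameter": "temperature", "time": datetime(2022, 1, 1), "lat": 40.0, "lon": -105.0, "value": 271.3},
        {"parameter": "temperature", "time": datetime(2022, 6, 1), "lat": 40.0, "lon": -105.0, "value": 295.1},
        {"parameter": "pressure", "time": datetime(2022, 1, 1), "lat": 40.0, "lon": -105.0, "value": 84000.0},
        {"parameter": "temperature", "time": datetime(2022, 1, 1), "lat": -10.0, "lon": 30.0, "value": 300.2},
    ]
    request = SubsetRequest(
        parameters=frozenset({"temperature"}),
        start=datetime(2022, 1, 1),
        end=datetime(2022, 3, 31),
        lat_range=(30.0, 50.0),
        lon_range=(-110.0, -100.0),
    )
    for record in extract_subset(records, request):
        print(record)
```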
The Research Data Archive (RDA) delivered more than 7.9 PB of data to 14,300 users in FY22. Nine new data set collections were added and 75 new digital object identifiers (DOIs) were assigned to enable formalized citation capabilities for RDA collections. CISL staff continued harvesting and compiling available RDA data citation counts based on formal citations that use RDA DOIs. The effort identified 508 peer-reviewed articles or books that cited RDA data sets from October 2021 to September 2022, bringing the total citation count to 1,424.
In addition to the access provided by the THREDDS data server, Globus transfer services, and HPC subsetting services, CISL supports remote access capabilities, including Jupyter Notebook downloads, and visualization capabilities for selected gridded data sets.
In January 2022, CISL launched the new Geoscience Data Exchange (GDEX), completing an expansion and rebranding of the former DASH Repository. The GDEX transition was a response to the need to provide open access as required by federal policy, funding agencies, and scientific publishers, and to support scientific reproducibility and secondary use of data for new research. The GDEX team was preparing to apply to CoreTrustSeal for Trusted Data Repository certification early in FY23.
Other CISL highlights in FY22 included:
- Successful migration of Community Climate System Model and Community Earth System Model outputs from the tape-based High-Performance Storage System (HPSS) to the NCAR Campaign Storage file system and GLADE file spaces. HPSS access infrastructure was removed from the code base and deployment environment.
- CISL also submitted a System Use Rate proposal to NSF to facilitate archiving of non-NCAR data.
- Migration of Research Data Archive ERA-5 holdings to Stratus, the CISL object storage disk system for long-term storage.
Advancing Data Science
CISL plays a pivotal role in supporting NCAR’s strategic objective of improving predictions of weather and climate and better estimating their impact. We do this by advancing development of data assimilation tools; enhancing visualization, analysis, and augmented-reality tools for geophysical data; and improving the efficiency of application workflows.
In partnership with NCAR’s Climate and Global Dynamics Laboratory, CISL helped spearhead the creation of the Earth System Data Science (ESDS) Initiative. The vision of this grassroots effort is to increase the effectiveness of the NCAR/UCAR scientific workforce and prepare for open science by promoting deeper collaboration on data analysis between scientific and computing staff. Working closely with Project Pythia, ESDS has helped foster communities of practice around activities such as weekly office hours, bi-weekly presentations and open-topic forums, and vibrant discussion forums. Topic focus areas include scalable workflows, interactive analysis with the scientific Python ecosystem, and cloud computing. This initiative has grown significantly and now includes representatives of other NCAR labs on the steering committee.
Other FY22 highlights:
- Version 3.7 of the Visualization and Analysis Platform for Ocean, Atmosphere, and Solar Researchers (VAPOR) was released in FY22. This brought new capabilities to the geoscience community by incorporating a long-awaited Python scripting API, enhancing Lagrangian particle visualization, and parallelizing unstructured grid code in support of kilometer-scale climate and global weather models. VAPOR was downloaded 3,783 times and cited 22 times in research publications. Five virtual tutorials attracted 276 registered participants, while educational materials on the VAPOR YouTube channel had 4,176 views.
- The Geoscience Community Analysis Toolkit (GeoCAT) team reached new milestones for downloads while also increasing the size of its user base. GeoCAT-comp, the computational component of GeoCAT, received an average of around 1,100 page visits per month. It reached a high of about 2,100 downloads per version – a likely indicator of individual users – through the Anaconda package management system and Python Package Index (PyPI). It also exceeded 23,000 all-time downloads through Anaconda and PyPI. The GeoCAT plotting gallery – GeoCAT-examples along with the GeoCAT-viz convenience library – received about 4,900 monthly page visits. All-time gallery downloads exceeded 20,000. WRF-Python, the GeoCAT WRF diagnostics toolkit, received about 10,500 monthly page visits, and its highest number of downloads per version was more than 1,900 through Anaconda.
- Several improvements were made to the Large Data Comparison for Python utility. These included enhanced handling of 3D data sets to enable better understanding of the differences between vertical levels, Fourier decompositions, and several entropy-related metrics. Experimenting with mutual information content as a predictor of compression level for Community Atmosphere Model (CAM) data resulted in a paper accepted to an SC22 workshop. Collaboration with CAM scientists on lossily compressed sample data and relevant analyses is ongoing.
- Completion of Data Assimilation Research Testbed (DART) testing of bounded prior rank histogram distributions for observations with the CESM CICE model.
- CISL updated the WRF-DART prediction system for the Red Sea region in order to obtain a more accurate representation of that ecosystem's variability. This entailed interfacing the biogeochemical component of the MIT General Circulation Model – N-BLING – with DART.
- Establishment of an informal partnership with the U.S. Department of Energy’s Regional and Global Modeling and Analysis program.
- Project Pythia presented nine remote tutorials, each of which drew about 100 participants. They covered a wide range of topics centered on using the scientific Python ecosystem for Earth system science data analysis. All of the tutorials were recorded and are now available from the Project Pythia Resource Gallery.
- In collaboration with the AI2ES AI Institute, CISL developed and deployed a real-time storm mode analysis tool and interactive visualization for NOAA convection-allowing models. Forecasters evaluated the tool as part of the NOAA Hazardous Weather Testbed Spring Experiment.
- Also in collaboration with AI2ES, CISL implemented evidential neural network uncertainty quantification methods for weather and climate problems to estimate aleatoric and epistemic uncertainties without the need for large ensembles of machine learning models. Evidential methods were tested on surface layer flux estimation and predicting winter precipitation type.
- CISL completed the first version of its SPERR wavelet-based, error-bounded, lossy data compressor and released it publicly on GitHub. The SPERR compressor offers rate-distortion performance superior to that of other leading scientific data compressors (a publication is forthcoming in FY23). In addition to releasing SPERR, CISL is working with the NCAR High Altitude Observatory to incorporate SPERR into an Accelerated Scientific Discovery project using the MURaM model.
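The "error-bounded" property mentioned above for lossy compression can be illustrated with a much simpler scheme. The sketch below is not SPERR's wavelet-based algorithm; it uses plain uniform quantization, and the function names and tolerance value are hypothetical. It shows only the guarantee that error-bounded compressors provide: every reconstructed value stays within a user-chosen tolerance of the original.

```python
import math


def quantize(values, tolerance):
    """Map each value to an integer bin index; bins are 2 * tolerance wide."""
    step = 2.0 * tolerance
    return [round(v / step) for v in values]


def reconstruct(indices, tolerance):
    """Recover approximate values by placing each index at its bin center."""
    step = 2.0 * tolerance
    return [i * step for i in indices]


def max_abs_error(original, restored):
    """Largest pointwise difference between original and reconstructed data."""
    return max(abs(a - b) for a, b in zip(original, restored))


if __name__ == "__main__":
    tolerance = 0.05  # assumed user-specified error bound
    field = [math.sin(0.1 * k) for k in range(100)]  # made-up smooth signal

    indices = quantize(field, tolerance)
    restored = reconstruct(indices, tolerance)

    # A small slack absorbs floating-point rounding in the reconstruction.
    within_bound = max_abs_error(field, restored) <= tolerance + 1e-12
    print("distinct integer codes:", len(set(indices)))
    print("error bound respected:", within_bound)
```

The integer codes take far fewer distinct values than the original floating-point data, which is what a subsequent lossless coding stage would exploit; a real compressor adds a transform and encoder on top of this idea.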
Outreach, Diversity and Education
CISL once again demonstrated NCAR’s ongoing commitment to the education and training of early-career scientists, engineers, and technicians with the successful Summer Internships in Parallel Computational Science (SIParCS) program and other ongoing learning and outreach activities.
After two summers of virtual participation because of COVID-19 pandemic restrictions, SIParCS returned to a primarily in-person opportunity for graduate and undergraduate students to interact with and learn from CISL staff and other mentors. CISL hosted 18 students – 11 of them from underrepresented backgrounds – for the FY22 internship program. The interns’ final presentations described their accomplishments on challenging topics that included data assimilation, lidar integration, machine learning, and GPU computing.
Among other outreach activities, staff at the NCAR-Wyoming Supercomputing Center (NWSC) in Cheyenne were able to host tours of the facility for 170 individuals. While the NWSC remained closed to walk-in visitors because of pandemic restrictions, tours were available by appointment. Visitors included representatives of the National Science Foundation, U.S. Geological Survey, and U.S. Air Force as well as groups of NCAR and UCAR employees.
A team of CISL staff and representatives from other NCAR labs worked with vendors throughout the fiscal year on an extensive overhaul of the NWSC Visitor Center. All-new content was created and approved to introduce “The Story of Data” as a new theme for showing visitors how NCAR uses data, from collection through processing and visualization. Installation began in November and will be completed by the end of the calendar year.