SIParCS 2022 - Emma Marshall
Demonstrating cloud-based remote sensing data workflows with xarray
Recent advances in satellite imagery availability, cloud-computing resources and open-source software represent exciting developments in earth science and climate research. Limited access to computing resources and storage is a significant bottleneck and barrier to participation, one that hinders efforts to undo historical legacies of exclusion in the sciences. Transitioning to cloud-based workflows has the potential to drastically increase efficiency and broaden scientific participation, two important objectives on the path to understanding and preparing mitigation strategies for a changing climate. Evolving computational tools bring an associated need for detailed, accessible documentation and educational resources that increase adoption of these tools and datasets. This work presents relevant open-source contributions, focusing on Jupyter Book tutorials. The tutorials demonstrate the steps involved in accessing and interacting with cloud-hosted remote sensing datasets on platforms such as Amazon Web Services (AWS) and Microsoft Planetary Computer using the open-source Python package xarray. They were developed with an emphasis on accessible, explanatory text that includes solutions to commonly encountered errors, step-by-step descriptions of xarray functionality, and ways to incorporate xarray tools into common scientific pipelines. Scaling educational resources related to remote sensing datasets, cloud-computing resources and scientific data analysis is critical to realizing the benefits of these resources. This work seeks to make a small contribution toward that goal and to lay a framework through which future examples may be developed.
Mentors: Deepak Cherian, Scott Henderson, Jessica Scheick, Kevin Paul
Slides and poster