| Speaker(s) | Title of Presentation | Abstract |
|---|---|---|
| Shahzeb Siddiqui | Spack Infrastructure at NERSC | In this talk, we will present an overview of the Spack infrastructure used at NERSC for software stack deployment. We leverage GitLab to perform full source rebuilds of the stack on our test system, followed by production deployment on Perlmutter and Cori. We use GitLab scheduled pipelines to rebuild the software stack, ensuring it can be reproduced and is not affected by underlying system changes. Our software stack is based on the Extreme-Scale Scientific Software Stack (https://e4s.readthedocs.io/), which comprises 80+ scientific software packages, a subset of Spack packages built and tested together. We will discuss the process for deploying the E4S stack on NERSC systems. Since October 2020, we have been deploying E4S on a quarterly basis, with three deployments on Cori (e4s/20.10, e4s/21.02, e4s/21.05) and one deployment on Perlmutter (e4s/21.11). We will discuss some lessons learned during this software deployment process. |
| Rajesh Kalyanam, Carol Song | GeoEDF: A Workflow Framework for Composing and Executing Geospatial Research Workflows | Geospatial research workflows often involve data retrieval from various data sources using different protocols; data inspection, filtering, and preprocessing to prepare the data for use with researchers' own domain-specific code; and finally analysis or simulation carried out on HPC resources. In practice, these steps may comprise various scripts, custom-built code, and publicly available libraries that are then carried out on a mix of desktop tools, web-based science gateways, and HPC resources. Consequently, researchers spend an inordinate amount of time wrangling and transferring data between these locations, and the overall reproducibility of the workflow suffers. The GeoEDF workflow framework addresses these issues, providing a framework for developing reusable building blocks that implement individual workflow steps, and a template for composing these building blocks into end-to-end research workflows that execute entirely on HPC resources. GeoEDF workflows are conceived as an abstract sequence of data acquisition and processing steps specified in the YAML format. The actual data acquisition and geospatial processing operations are implemented in open-source, reusable Pythonic data "connectors" and "processors". A Python-based GeoEDF workflow engine transforms a YAML GeoEDF workflow into executions of the corresponding connectors and processors on an HPC system, automating data transfers between steps. The GeoEDF workflow engine is currently deployed in the Jupyter interactive computing environment on the publicly accessible MyGeoHub science gateway. Users can instantiate, execute, and monitor the status of a workflow from this Jupyter environment. The data connectors and processors are managed in GitHub, where a CI/CD process transforms community-contributed connectors and processors into Singularity containers that can be executed on HPC resources. In this talk, we will describe the implementation of GeoEDF and demonstrate the use of the GeoEDF workflow engine on MyGeoHub via a proof-of-concept hydrology workflow that retrieves MODIS data for a certain date range and watershed (in ESRI Shapefile format) as HDF files from a NASA Distributed Active Archive Center (DAAC) and returns a new watershed shapefile with per-polygon averages for the user's choice of HDF variables. |
| Joseph Schoonover | Continuous Benchmarking Practices applied to the Spectral Element Library in Fortran | This talk will cover the basics of how Fluid Numerics approaches continuous integration and continuous benchmarking for an object-oriented Fortran library, the Spectral Element Library in Fortran (SELF). As a cost-conscious, cloud-native organization, we have developed a practice that focuses on building a database of benchmark runtimes as a function of compiler, compiler flags, floating-point precision, and target hardware. By leveraging Cloud Build, a Slurm-based cluster deployment, and a bespoke build step called fluid-run, benchmarking has become part of our automatic testing and naturally stockpiles information on application performance over time. To visualize and explore benchmarking data, I'll discuss dashboarding options, including the free web-based Data Studio solution from Google, and various views that help guide recommendations for ideal hardware choices and compiler options. Because testing incurs additional CPU-hour and GPU-hour expenses, I'll also cover strategies for selectively running benchmarks through slash commands in git commit messages. |
| Patrick Stegmann | Modern Software Engineering Tools and Practices for the CRTM | The Community Radiative Transfer Model (CRTM) is a comprehensive all-sky radiative transfer model for satellite data assimilation and remote sensing applications, developed and maintained by the Joint Center for Satellite Data Assimilation. As such, it is an integral part of NCAR's Gridpoint Statistical Interpolation (GSI) system and the more recent Joint Effort for Data Assimilation Integration (JEDI) framework of the Joint Center. Development of the CRTM began in 2004, with an early emphasis on modular design and community-driven improvement, building on earlier roots in the OPTRAN model from 1977. Over the course of its history, the CRTM has developed into a large and complex Fortran library with elements from all recent Fortran standards and supporting scripts in IDL, Ruby, and Python. Its mission, to integrate the contributions from a diverse community of engineers, scientists, and others into a working whole, remains a success story. With the advent of the C++-based JEDI framework, the modernization effort for the CRTM has intensified, with the goal of improving collaboration, developer productivity, model flexibility, and operational speed. This has led to a series of rapid changes from v2.3 to v2.4.1 of the CRTM, the major ones of which are discussed in this presentation: • Aligning itself with the JEDI framework, the CRTM has transitioned seamlessly from its original Subversion-based version control to a Git repository hosted on GitHub, preserving the development history in the process. • Project management of the agile development process is enabled via the ZenHub platform. Progress is tracked via an online Scrum board, a global roadmap, and various performance metrics. • Continuous Integration (CI) is now available and enforced, first based on the Travis CI service and later on the AWS cloud. • The complex legacy autoconf build system has effectively been superseded by the ecbuild extension of CMake, which is maintained by ECMWF, and will eventually be phased out entirely. This change in particular has simplified maintaining the CRTM on various platforms and with a number of new dependencies. • A long-requested feature: the look-up tables of the CRTM can now be used in both netCDF and the legacy binary format, providing more flexibility and transparency for CRTM users. • OpenMP parallelization was implemented for the forward model and the model Jacobians. |
| Scott Pearse | Visualization Software: VAPOR's workflow for supporting the scientific community | VAPOR is a desktop application for 3D visualization of simulation data. In the geophysical sciences, visualization is often done with 2D slices through a volume of simulated data. This can be effective but may omit data outside of the slice. This omitted data can hold important observables in a given simulation, but including it has challenges. In this talk, I'll discuss the challenges of 3D visualization and VAPOR's mission to reduce them. I'll give a basic demonstration of how VAPOR is used, from installation to producing a scientific visualization. After that, I'll give an overview of how our team operates during release cycles, including notes on how we prioritize bugs, features, and community engagement. Finally, I'll discuss the continuous integration systems we use to produce installers and perform testing on our repository. |
| Timothy Brown | Scaling NWP workloads on AWS to achieve your research goals | The use of cloud computing technologies within HPC has grown considerably over the last few years. With these advances, there are more options for running NWP than ever. In this talk we distill the options and show how researchers can get started in an environment they're familiar with. We'll discuss cluster orchestration with AWS ParallelCluster and Slurm, a parallel filesystem with Amazon FSx for Lustre, high-performance networking with the Elastic Fabric Adapter (EFA), and software management with Spack, and then we'll present scaling and cost analyses of FV3GFS and WRF across AWS HPC6a (AMD Milan), C6i (Intel Ice Lake), and C6g* (Arm-based AWS Graviton2) instances. |
| Ben Lazarine | Assessing the Vulnerabilities of Source Code in GitHub for Scientific Cyberinfrastructure: An Artificial Intelligence Perspective | |
| Stanislaw Jaroszynski | Migrating a Scientific Desktop Application to Python | GUI desktop applications are integral to 3D data visualization and analysis. In recent years, the scientific community has been moving towards Jupyter notebooks and scripted workflows to explore data and share findings. In order to maximize their reach and usefulness as scientific tools, desktop applications such as NCAR's VAPOR (www.vapor.ucar.edu) need to adapt and provide Python bindings to their visualization capabilities. This can prove to be a challenge for an application that entered development long before such requirements existed. This talk covers the tools used and developed to create a Python interface to the VAPOR application, the pitfalls encountered, and solutions to problems such as integrating OpenGL rendering in headless environments. It also covers tricks used to accomplish this effort with limited resources by automating tasks such as the migration of existing developer documentation to Python docstrings. |
| Rob Gonzalez-Pita | A GitHub Actions Self-Hosted Runner for HPC Applications | GitHub Actions has emerged as a leading, easy-to-use, low-cost solution for implementing continuous integration and continuous deployment (CI/CD) pipelines in GitHub-served repositories. CI/CD pipelines can run on a variety of user actions, such as when a code manager clicks a button in the web interface or when someone opens a Pull Request to the main code branch. For many types of pipelines, the default Linux runner provided by GitHub, with 2 cores, 7 GB of RAM, and 14 GB of disk space, is enough, but it can be insufficient for many HPC applications used for numerical weather prediction (NWP). The UFS Short-Range Weather (SRW) Application, for example, uses 4 cores by default to build in about 30 minutes on NOAA's RDHPCS Hera, which is composed of Intel Skylake processors with 2.4 GB of memory per core, a total of 9.6 GB of memory. Like so many NWP systems, the SRW App also requires a battery of end-to-end integration tests to convince code managers that code modifications meet a "do no harm" requirement for the existing code base. Running the 80 or so end-to-end tests on Hera requires several thousand core-hours and nearly 2 TB of disk space. To test the parallel computing capability of the system, many of the workflow tasks use several full 40-processor nodes to meet the needs of memory- or compute-bound processes. In this talk, we will share some of the lessons we've learned standing up an AWS ParallelCluster as a self-hosted GitHub Actions runner, including security requirements, managing how and when code is tested, and the challenges of coordinating tests for Pull Requests contributed to multiple repositories. |
| Jonathan Edelen | Implicit Optimization of Accelerator Beamlines with Machine Learning | One of the fundamental challenges of using machine-learning-based inverse models for optics tuning in particle accelerators, particularly transfer lines, is the degenerate nature of the magnet settings and beam envelope functions. Moreover, it is challenging, if not impossible, to train a neural network to compute correct quadrupole settings from a given set of measurements due to the limited number of diagnostics available in operational beamlines. However, models that relate BPM readings to corrector settings are more forgiving and have seen significant success as a benchmark for machine learning inverse models. We recently demonstrated that when comparing predicted corrector settings to actual corrector settings from a BPM inverse model, the model error can be related to errors in quadrupole settings. In this paper, we expand on that effort by incorporating inverse model errors as an optimization tool to correct for optics errors in a beamline. We present a toy model using a FODO lattice and then demonstrate the use of this technique for optics corrections in the AGS to RHIC transfer line at BNL. |
| Edward Hartnett | Improving Documentation, Testing, Process, and Code for Legacy NOAA GRIB2 C/Fortran Libraries | NOAA has created, and maintains, several C and Fortran libraries for reading and writing data in the GRIB1 and GRIB2 formats. The code in these libraries is, in many cases, decades old, written by programmers and scientists who have moved on to other projects or left the organization. To modernize these codes, and to ensure that they continue to provide value for NOAA in the coming decades, we have improved and added documentation and unit tests, subjected the code to continuous integration testing and an agile software process, and carried out significant refactoring. In this paper we describe the challenges and solutions we uncovered while bringing these legacy codes into the modern era of software development. |
| George McCabe | Continuous Integration of METplus Use Cases Using GitHub Actions and Docker | METplus is a software suite used around the world to perform consistent, reproducible Numerical Weather Prediction model verification. It consists of multiple highly configurable software components, and community contributions are encouraged to extend its capabilities. However, this flexibility makes effective testing challenging. Individual software components may be sufficiently tested, but changes may inadvertently affect other components. Software dependencies required for one user's contributions may not be available in other users' computing environments. The METplus team has implemented a complex continuous integration framework designed to address these challenges. METplus leverages GitHub Actions to trigger workflows that perform a variety of tasks. Over 100 use case examples are provided in the METplus GitHub repository. They are automatically tested to confirm that changes to the METplus components do not break or change existing functionality. A set of rules determines what should be run during an automated workflow. A subset of tests can be run when a developer pushes changes to a repository. The full suite of tests, along with logic to compare the output to truth data, is run when a request is made to merge changes. A developer can also manually enable or disable testing components. This flexibility improves the development process while avoiding unnecessary execution of jobs. Docker integrates nicely into GitHub Actions and is used to perform many useful functions of the METplus testing workflow. The MET C++ executables are installed inside a Docker container so that the desired version of the software can be easily obtained instead of installed redundantly. Each use case corresponds to an input dataset that is made available through Docker data volumes. Output is stored on DockerHub and is used in comparison tests to alert developers to unexpected differences. Docker is also used to isolate testing environments for cases that require additional software dependencies. In this talk, I will (a) outline the automated testing performed through GitHub Actions for METplus, (b) describe the critical role of Docker containers in this workflow, and (c) illustrate how this approach streamlines the development process, saves time and money, and produces more robust results. |
| Sean Cleveland | Multi-Container Actor Workflows For Supporting A Climate Science Gateway in Hawaii | The reproducibility of analyses is an important factor in the credibility of research and affects the reuse of research outputs and products. The rise of software container technologies has helped address some reproducibility issues by capturing the environment and many of the dependencies required to rerun an analysis. How can we leverage containers in a reproducible, provenance-tracked way for workflows that must execute daily to support near-real-time analysis of climate data for the state of Hawaii? And how can we support researchers as the workflow code evolves and new variables and data are added? We present an approach that leverages the serverless actor API in the Tapis framework with its custom CRON-scheduling system, where parameters (environment variables) of the container can be adjusted ahead of each run and different tasks can be handled at arbitrary intervals as needed by researchers, with built-in provenance tracking of each execution. Combining this with GitHub webhooks and a layered container structure, which automates and streamlines the building and versioning of containers as new data and products are added to the workflow code, allows researchers to more easily push updates to the production workflow. |
| Miranda Mundt | A Tiered Approach to Scientific Software Quality Practices | In industry, software quality assurance rests upon decades of experience through which recommendations and best practices have been derived and applied. Scientific software, however, has not been able to directly benefit from these guidelines. As software is an integral part of modern scientific research, all scientific software benefits from software quality best practices, yet these can seem unachievable to small, short-term, or newly formed research projects. We introduce the concept of a flexible, tiered approach to software quality practices that allows projects of varying sizes and maturities to adhere to appropriate, tailored practices. These practices are introduced at natural times in a project's lifecycle such that they can be integrated into a team's workflow and expand with the project. We will cover an example tiered framework we developed for the Center for Computing Research at Sandia National Laboratories, factors that influenced its design, and potential paths for future research into tiered software quality frameworks. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525. SAND No. SAND2022-0872 A. |
| Kyle Hall, Nachiketa Acharya | XCast: A High-Performance Data Science Toolkit for Climate Forecasting | Numerous earth science research problems and operational forecasting challenges are addressed by training statistical and machine learning models on gridded data, on a gridpoint-by-gridpoint basis. There are many libraries and toolkits for two-dimensional statistical modeling and machine learning in Python, but none provide support for this gridpoint-wise approach. Gridpoint-wise operations such as these are also, by nature, time-consuming and computationally expensive. In order to bridge the gap between Python's data science utilities and its earth science gridded data utilities, and to overcome the high cost of this type of computation, this Xarray-based climate forecasting toolkit implements a flexible, extensible, and highly performant set of tools for gridpoint-wise operations. It leverages the high-performance nature of Python's gridded data utilities, Xarray and Dask, to separate and parallelize the operations at each grid point, and bypasses Python's global interpreter lock in order to let the user utilize their computer's full capacity. Inspired by the PyELM-MME (Acharya and Hall, 2021) Extreme Learning Machine climate forecasting platform, previously designed by the authors, XCast allows standard Python data science libraries like SciKit-Learn and SciPy to be efficiently applied to gridded data with minimal time investment and computational overhead. It implements machine learning methods like Random Forest, Pruning ELM, and others for both deterministic and probabilistic forecasting, and can be easily extended by the user. XCast also implements K-Fold cross-validation, the probabilistic and deterministic forecast verification modules recommended by the WMO, and a set of climate-data-specific forecast tools and metrics. Here we describe the co-development of XCast and several case studies highlighting its flexibility, extensibility, and broad applicability to fields like Multi-Model Ensemble forecasting and teleconnection-based forecasting. Reference: Acharya, N. and Hall, K.J.C. (2021): PyELM-MME: A Python platform for Extreme Learning Machine based Multi-Model Ensemble, Proceedings of the 2021 Improving Scientific Software Conference (No. NCAR/TN-567+PROC). |
| Carl Drews | Marching Cubes Without Lookup Cases: Crawl the Edge Crossings Instead | The Marching Cubes 33 algorithm (Chernyaev 1995) uses a lookup table of 33 cases to formulate a 3D isosurface of some scalar quantity. The author presents an alternative algorithm that does not use a lookup table, but instead crawls around each cube along the edges of the isosurface. The algorithm uses the central position of each edge crossing relative to the two adjacent vertices to resolve ambiguities in the configuration of the isosurface on each cube face. The paper presents diagrams showing how the isosurface is determined on each face and then extended to span the middle of the cube. The algorithm uses straight lines instead of the hyperbolae used in the asymptotic decider (Nielson and Hamann 1991). We expect to obtain a better isosurface by taking the crossing position into account, instead of employing arbitrary rules to resolve the ambiguities. ACOM has implemented the "crawling" version of Marching Cubes in Python. We use our Python code to derive isosurfaces of two atmospheric chemicals: ozone and wildfire smoke. ACOM's chemical model WRF-Chem calculates forecasts of chemical weather, and the isosurfaces are generated from model output on a daily basis. These isosurfaces are expressed as KMZ 3D model files; Google Earth provides a 4D viewing platform with a time slider to animate the shapes over time. The resulting visualization makes it easier for atmospheric scientists to analyze plumes of ozone or smoke as they evolve. The 4D visualization can inform where to construct more detailed cross sections and can guide flight planning for research flights. WRF-Chem poses some challenges for generating and rendering the 3D isosurfaces. The "cubes" near the surface are actually thin rectangles, and the model coordinates follow the terrain, becoming tilted over mountain ranges. The paper explains how to solve these problems using cube coordinates for each model cell. |
| Janine Aquino | War Stories and Battle Scars: Lessons Learned Rescuing Legacy Code and How to Bolster Code to Withstand the Test of Time | When a colleague retired in 2016, I was asked if I would be willing to step in and finish a project to update some legacy code to a modern language and implement modern coding practices. "Sure", I said, "How big of a project is it?" "About 12 weeks" came the reply. I was so naive. Nearly 6 years later the project is well defined, there is a clear path forward, and we are about 40% of the way to a completed project. In this talk I will share what I learned about determining project scope, teaching myself a 'classic' language, and how code today can implement tools and practices to help the poor soul who has to update it after we are all long retired. |
| Guilherme Castelao, Luiz Irber | Gibbs Seawater Toolbox implemented in Rust for efficiency and robustness | The Gibbs Seawater Toolbox (GSW) is a key software package for oceanography, since it provides consistent thermodynamic properties of seawater, conversions, and other utilities. GSW has been adopted since 2009 by the Intergovernmental Oceanographic Commission as the official description of seawater. Although it is available in several programming languages, most implementations, such as Python, Julia, and R, are wrappers around the C library (GSW-C). Here we introduce a version of GSW implemented in pure Rust (GSW-rs), initially developed for inclusion in microcontroller firmware to support autonomous decisions and onboard machine learning. The same implementation also works on regular computers and can seamlessly replace GSW-C in apps and libraries by maintaining compatibility with the GSW-C Foreign Function Interface (FFI). Thanks to zero-cost abstractions, GSW-rs does not impose a trade-off between performance and readability, allowing it to be written for clear understanding, closer to the original scientific publications. It is therefore easier to verify and maintain. Another key aspect is the support for testing: GSW-rs is subject to unit tests as well as validation against the reference dataset from TEOS-10, allowing for consistent development through continuous integration. Modern oceanography strongly relies on autonomous platforms - such as Argo floats, Spray underwater gliders, and Saildrones - to provide sustained observations. Software robustness and performance are critical requirements for these platforms to operate with low energy budgets and for up to several years in a single deployment, making Rust an optimal language for this task. At the same time, the expanding cloud infrastructure can give the illusion of infinite computing, but convenient programming languages such as Python must rely on high-performance languages to optimize bottlenecks. A Rust implementation of GSW allows sustainable and efficient progress, from embedded to high-performance computing. |
| John Dennis | Enabling Execution on a Graphics Processing Unit of an Idealized Atmospheric Model | The Cloud Model version 1 (CM1), which has much of the same physics found in more complex weather prediction models like the Weather Research and Forecasting Model (WRF), is a simplified model that can be used to perform idealized studies of the earth's atmosphere. We utilize OpenACC directives to enable the execution of CM1 on Graphics Processing Units (GPUs). We describe the necessary code transformations and the resulting performance on a scientifically relevant configuration. We find that the use of a single NVIDIA V100 GPU enables a 4x speedup versus the Intel Broadwell-based node used in the NCAR Cheyenne system. |
| Mark Coletti, Ada Sedova | Experiences and best practices for designing effective exascale scientific computing workflows | Numerous fields within scientific computing have embraced advances in big-data analysis and machine learning, which often requires the deployment of large, distributed and complicated workflows that may combine training neural networks, performing simulations, running inference, and performing database queries and data analysis in asynchronous, parallel and pipelined execution frameworks. Such a shift has brought into focus the need for scalable, efficient workflow management solutions with reproducibility, error and provenance handling, traceability, and checkpoint-restart capabilities, among other needs. Here, we discuss challenges and best-practices for deploying exascale-generation computational science workflows on resources at the Oak Ridge Leadership Computing Facility (OLCF). We present our experiences with large-scale deployment of distributed workflows on the Summit supercomputer, including for bioinformatics and computational biophysics, materials science, and deep-learner model optimization. We address problems and solutions arising from the need to perform error handling, provenance tracing, and checkpoint-restart on output datasets created by tens of thousands of workers each producing large numbers of output files in a dataflow execution regime. We also present problems and solutions created by working within a Python-centric software base on traditional HPC systems, and discuss steps that will be required before the convergence of HPC, AI, and data science can be fully realized. Our results point to a wealth of exciting new possibilities for harnessing this convergence to tackle new scientific challenges. |
| Negin Sobhani | Interactive visualizations and analysis for CTSM simulations at NEON sites | Modern Earth-system modeling research uses a variety of complex observational data together with ever more sophisticated computational models of the planet. Complex observational data, as well as the complicated software and setup required for running Earth-system models, present a major hurdle for novice users experimenting with these models. An NSF-funded initiative between NEON, the National Ecological Observatory Network, and NCAR, the National Center for Atmospheric Research, is taking on this challenge by providing user-friendly tools and infrastructure for running Community Terrestrial System Model (CTSM) simulations and evaluating them using NEON data. As part of the NCAR-NEON collaboration, we have developed a new Python-based interactive visualization platform that enables users to explore and interact with model outputs and observations on the fly. This tool allows users to generate publication-quality graphs and conduct statistical analyses comparing CTSM simulations and observational data over NEON sites, without the need to download the observational data or run the model. The dashboard was developed using a scientific Python stack, including Xarray, Bokeh, and Holoviews. Users select NEON sites, variables, and output frequencies to visualize using a graphical user interface (GUI), and the tool offers different types of interactive visualizations and statistical summaries based on those selections. The web-based dashboard does not require specialist knowledge to operate; therefore, it can be used for educational outreach activities and in classrooms. |
| Dmitry Pekurovsky | Progress in a computation overlap framework in P3DFFT++, a portable open source library for Fast Fourier Transforms | P3DFFT++ is a highly adaptable open source framework for multidimensional Fast Fourier Transforms and related spectral algorithms. This class of algorithms covers a large number of computational domains. Spectral algorithms are quite difficult to scale on large HPC systems. Hiding communication latency, at least partially, is one way to reduce time to solution, and is one of the goals of this work. In addition, this work aims to implement the algorithms so they can run on heterogeneous platforms with GPUs, and do so in a highly portable, user-friendly manner. In this context we implement and evaluate overlapping computation with data transfer to/from GPU as well as GPU-directed communication techniques. Eventually the framework will have a unified API for running on CPUs and GPUs. |
| Victor Eijkhout, Amit Ruhela | TUTORIAL: An Introduction to Advanced Features in MPI | TUTORIAL OBJECTIVES To discuss a number of MPI-3 and MPI-4 features that offer more flexibility, a more elegant expression of algorithms, and higher performance, including the following: -One-sided communications with atomic operations -Shared memory between MPI processes on a node -Non-blocking collectives -Graph topologies -MPI Tools Interface |
| Reed Milewicz, Miranda Mundt, Elaine Raybourn | Productivity and Sustainability Improvement Planning: A Lightweight Method for Improving Your Software Practices | This tutorial session will provide an overview of the Productivity and Sustainability Improvement Planning (PSIP) process: a lightweight, iterative workflow where teams identify their most urgent software development and sustainability bottlenecks and track progress on work to overcome them. PSIP captures the tacit, more subjective aspects of team collaboration, workflow planning, and progress tracking. In the potential absence of planning and development processes, and as scientific software teams scale to larger, more diverse, aggregate teams, unforeseen disruptions or inefficiencies can often impede productivity and innovation. PSIP is designed to bootstrap aggregate team capabilities into best practices, introduce the application of appropriate resources, and encourage teams to adopt a culture of process improvement. This tutorial will include a hands-on session where participants are given the opportunity to reflect on existing practices in their current projects (by taking an online self-assessment) and create a plan for improvement. These plans will come in the form of Progress Tracking Cards (PTCs) which will be adapted from real-world examples from Exascale Computing Project teams and the PSIP PTC catalog. The outcome of this session will provide concrete steps that teams can take to improve their development practices through the use of reflection (project self-assessment) and PTCs. Participants will learn skills needed to approach their teams, collaborators, and end users to precipitate process improvement. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525. SAND2022-0870 A. |
| Joseph Schoonover, Guy Thorsby | TUTORIAL: Leveraging Google Cloud for Continuous Benchmarking | Often, developers want to measure the bulk runtime of their software for specific configurations and with specific input decks. Additionally, it is useful to understand how the runtime varies as a function of the compiler, compiler flags, and underlying hardware. For many teams, benchmarking is a manual practice that is ripe for automation by adding it as a step in continuous integration processes. However, for many scientific applications, the testing environments provided through publicly available Git repositories do not offer sufficient compute resources for benchmarking at scale. By integrating with a public cloud provider, like Google Cloud, benchmarking can be scheduled to run on GPU-accelerated and multi-VM clusters. In this tutorial, we will present an overview of the goals for benchmarking applications, open-source solutions for benchmarking applications on Google Cloud, how to add benchmarking to continuous integration workflows, and how to visualize and track benchmark statistics over time. Following the presentation, we will walk through a hands-on codelab (virtually) alongside tutorial attendees that shows how to modify a mini-app's repository for continuous benchmarking. This will include scripts for container or VM image baking, infrastructure-as-code for solution deployment, and a Cloud Build pipeline. Additionally, we will walk through visualizing benchmarking results using Data Studio, a web-based platform for dashboarding data held in Google Cloud's BigQuery. Following the hands-on tutorial, we will break for interactive discussion on desirable continuous benchmarking practices, including (but not limited to) which metrics need to be measured and recorded, how a benchmarks database can be used to guide improvements in an application's performance and portability, and when to include detailed hotspot or system event profiles. TUTORIAL OBJECTIVES In the Leveraging Google Cloud for Continuous Benchmarking tutorial, attendees will learn how to: -Use Google Cloud Build to drive automated building and testing of a research application -Use the RCC-Run builder to add benchmarking to a Cloud Build pipeline -Create a dashboard with Data Studio to visualize application performance data generated by RCC-Run |
| Sameer Shende | E4S: Extreme-Scale Scientific Software Stack | The DOE Exascale Computing Project (ECP) Software Technology focus area is developing an HPC software ecosystem that will enable the efficient and performant execution of exascale applications. Through the Extreme-scale Scientific Software Stack (E4S) [https://e4s.io], it is developing a comprehensive and coherent software stack that will enable application developers to productively write highly parallel applications that can portably target diverse exascale architectures. E4S provides both source builds through the Spack platform and a set of containers that feature a broad collection of HPC and AI/ML software packages targeting GPUs from three vendors (Intel, AMD, and NVIDIA). E4S exists to accelerate the development, deployment, and use of HPC software, lowering the barriers for HPC and AI/ML users. It provides container images, build manifests, and turn-key, from-source builds of popular HPC software packages developed as Software Development Kits (SDKs). This effort covers a broad range of areas, including programming models and runtimes (MPICH, Kokkos, RAJA, OpenMPI), development tools (TAU, HPCToolkit, PAPI), math libraries (PETSc, Trilinos), data and visualization tools (ADIOS, HDF5, ParaView), and compilers (LLVM), all available through the Spack package manager. The tutorial will use AWS for the hands-on session and will show users how they can create custom containers starting with the base GPU images provided by E4S. It will also allow them to install E4S on their bare-metal systems. |
| David Bernholdt, Anshu Dubey, Rinku Gupta, Pat Grubel, Greg Watson, David Rogers | TUTORIAL: Better Scientific Software | Producing scientific software is a challenge. The high-performance modeling and simulation community, in particular, is dealing with the confluence of disruptive changes in computing architectures and new opportunities (and demands) for greatly improved simulation capabilities, especially through coupling physics and scales. At the same time, computational science and engineering (CSE), as well as other areas of science, is experiencing an increasing focus on scientific reproducibility and software quality. Computer architecture changes require new software design and implementation strategies, including significant refactoring of existing code. Reproducibility demands require more rigor across the entire software endeavor. Code coupling requires aggregate team interactions, including the integration of software processes and practices. These challenges demand large investments in scientific software development and improved practices. Focusing on improved developer productivity and software sustainability is both urgent and essential. This two-part tutorial distills multi-project, multi-year experience from members of the IDEAS Productivity project and the creators of the BSSw.io community website. The main tutorial will spend half a day providing information about software practices, processes, and tools explicitly tailored for CSE. Topics to be covered include: Agile methodologies and tools, software design and refactoring, testing and continuous integration, Git workflows for teams, and reproducibility. Material will be mostly at the beginner and intermediate levels. The second part of the tutorial will be an attendee-driven session with support from the tutorial team. Breakout opportunities will include working through hands-on activities based on the tutorial presentations, "bring your own code" discussions based on software engineering experiences and challenges in your own software projects, and more general discussions about software practice and experience. Breakouts will be organized based on attendee interest and the number of tutorial team members available. TUTORIAL OBJECTIVES Participants should be able to: -Describe a range of methods and strategies to improve software development processes, working towards better developer productivity, software sustainability, and scientific reproducibility. -Customize an approach for tailoring software development processes to the particulars of your project team and explain software value-related trade-offs. -Increase motivation, inspiration, and awareness of resources to help you work towards producing better scientific software, and thus better scientific outcomes, in your own projects. |
| Vas Vasiliadis | Tutorial: Scalable Automation of Data Management Tasks | Globus is widely used among the atmospheric, climate, and earth sciences community for reliable data transfer, but a growing number of computationally intensive research activities require commensurate large-scale data management. Globus platform services, combined with data distribution platforms that use the Modern Research Data Portal design pattern, can greatly simplify the development and execution of automated data management tasks. At the SEA 2021 conference we presented an overview of Globus platform services that facilitate the construction of automated flows, using our work with multiple instrument facilities as exemplars. In this tutorial we will focus on the APIs that facilitate integration with other systems to deliver unique data management capabilities. Our objective is to provide additional technical depth and hands-on exercises for attendees to build their own flows for moving data, running analyses, and sharing outputs with collaborators. We will also illustrate how these flows can feed into downstream data portals, science gateways, and data commons, enabling search and discovery of data by the broader community. |