Poster session
Name (first, middle initial, last)
Organization
NSF Award Title
NSF Award Number
Abstract (150 words maximum)
Keywords
1 - Tuesday, 10-11 am
Zlatan Aksamija
University of Utah
CDS&E: Coupled Electro-Thermal Transport in Two-Dimensional Materials and Heterostructures
2302879
The broad and growing cohort of 2D materials now spans graphene, hexagonal boron nitride (hBN), transition metal dichalcogenides (TMDs), phosphorene, and many others. While they may enable further scaling of nanoelectronic devices, their relatively weak van der Waals (vdW) interlayer bonds may lead to trapping the heat that is an inevitable byproduct of all electronic conduction and computation. Simulating thermal dissipation inside 2D nanostructures requires a two-way coupled treatment of electrons and heat, the latter being transported by lattice vibrations (phonons). Computing electron-phonon (el-ph) coupling from first principles and solving the electron and phonon Boltzmann transport equations using ab initio inputs has now reached full maturity. In this project, we are developing a platform to study non-isothermal transport, where electron and phonon populations are simulated concurrently so that phonons generated by el-ph coupling are tracked and their distribution/temperature fed back into electron transport and vice versa.
two-dimensional materials, heterostructures, phonons, transport, first principles, Monte Carlo
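The two-way coupling described above can be pictured as a fixed-point iteration between an electron solve and a phonon solve. A minimal sketch, with placeholder single-field solvers standing in for the actual Boltzmann transport machinery:

```python
# Illustrative fixed-point loop for two-way electro-thermal coupling.
# solve_electrons / solve_phonons are placeholders, not the project's solvers.
import numpy as np

def solve_electrons(T_phonon):
    # Placeholder: heat generated by el-ph coupling at the local phonon temperature.
    return 300.0 + 0.1 * T_phonon

def solve_phonons(heat_generation):
    # Placeholder: phonon temperature resulting from the deposited heat.
    return 300.0 + 0.05 * heat_generation

T_ph = np.full(10, 300.0)                     # initial phonon temperatures (K)
for iteration in range(100):
    q = solve_electrons(T_ph)                 # electrons heat the lattice...
    T_new = solve_phonons(q)                  # ...and phonons feed back
    if np.max(np.abs(T_new - T_ph)) < 1e-6:   # self-consistency reached
        break
    T_ph = T_new
print(f"converged after {iteration + 1} iterations")
```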
1 - Tuesday, 10-11 am
M. Joan Alexander
NorthWest Research Associates
Improving the Understanding and Representation of Atmospheric Gravity Waves using High-Resolution Observations and Machine Learning
2004512
Atmospheric gravity waves (GWs) play an important role in the exchange of momentum between the Earth's surface and the free atmosphere. They are excited by flow over topography, convective systems, and fronts, then propagate upward and horizontally. When they break at higher altitudes, momentum imparted to the waves at their generation level is deposited, playing an important role in the momentum budget of the troposphere and stratosphere. The process must be “parameterized” (estimated) for climate models based on the resolved flow. Current state-of-the-art parameterizations are hamstrung by computational limitations and the scarcity of observations. This project (https://cssi-gws.github.io/index.html) brings together experts in atmospheric dynamics, climate modeling, machine learning (ML), and data science and engineering to pursue four objectives: (1) Loon balloon data analysis and data portal archive. (2) High-resolution model simulations archived and shared. (3) Use ML techniques to develop data-driven parameterization schemes for GW momentum fluxes and drag. (4) Establish a framework for implementing and testing ML-based parameterizations in atmospheric models.
Climate model, sub-grid-scale waves, machine-learning
1 - Tuesday, 10-11 am
Anca Andrei
Tufts University
Elements: Morpho-Cyberinfrastructure for scientists and engineers studying shape change
2003820
An emerging theme across many domains of science and engineering is modeling materials that can change shape. With Morpho, domain scientists gain a powerful new simulation tool that enables them to tackle larger and more complex shape evolution problems than is presently possible. One example involves modeling the evolution of nematic liquid crystals with free boundaries, known as nematic tactoids, that are in contact with an isotropic fluid. In this work, we present results from applying a class of classical nonlinear numerical methods to this model and compare them with previously used gradient-descent methods. Moreover, by wrapping the algorithms in a multilevel nested iteration approach, we see significant improvements in efficiency of the simulations with a variety of initial guesses.
Morpho, shape optimization, nonlinear methods, nested iteration
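Nested iteration as described above can be sketched generically: solve on a coarse grid, interpolate, and reuse the result as the initial guess at the next level. A minimal illustration on a 1D nonlinear model problem (not Morpho itself; the problem and solver are stand-ins):

```python
# Conceptual nested-iteration sketch (not Morpho): solve a 1D nonlinear
# problem on a coarse grid, then reuse the interpolated solution as the
# initial guess on each finer grid.
import numpy as np
from scipy.optimize import fsolve

def residual(u, n):
    # Discretized residual of u'' - u**3 + 1 = 0 with u(0) = u(1) = 0.
    h = 1.0 / (n + 1)
    up = np.concatenate(([0.0], u, [0.0]))     # Dirichlet boundary values
    return (up[2:] - 2 * up[1:-1] + up[:-2]) / h**2 - u**3 + 1.0

u = np.zeros(7)                                # guess on the coarsest grid
for n in (7, 15, 31, 63):                      # grid hierarchy
    if len(u) != n:                            # interpolate the previous level
        x_old = np.linspace(0, 1, len(u) + 2)[1:-1]
        x_new = np.linspace(0, 1, n + 2)[1:-1]
        u = np.interp(x_new, x_old, u)
    u = fsolve(residual, u, args=(n,))         # Newton-type solve per level
print("max |u| on finest grid:", np.abs(u).max())
```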
3 - Wednesday, 10:30-11:30 am
Rafal Angryk
Georgia State University
Elements: Comprehensive Time Series Data Analytics for the Prediction of Solar Flares and Eruptions
NSF-OAC-1931555
We present our ongoing research progress from the Data Mining Lab at Georgia State University, supported by the NSF-OAC-1931555 award. Our focus is on improving solar flare predictions through data cleansing of the extensive Space Weather Analytics for Solar Flares (SWAN-SF) benchmark dataset. Anticipating solar flares accurately and promptly is pivotal due to their potential societal impacts. Our methodology incorporates Isolation Forest (iForest), a tree-based outlier detection algorithm. It targets the highly abundant N-class (non-flare) instances within SWAN-SF, identifying and eliminating anomalies prior to training. We employ TimeSeriesSVC (TSSVC), an SVM classifier for time series data, utilizing both the original and purified (a small number of N-class instances removed, guided by our contamination-rate determination experiments) versions of the SWAN-SF data. Assessment in our experiments is based on the true skill score (TSS) and the updated Heidke skill score (HSS2) for varying contamination rates. Our results show a significant improvement in solar flare prediction after iForest's outlier removal. Intriguingly, adding these outliers to the flare-labelled samples yields further enhancements.
Time Series Data Analytics, Prediction of Solar Flares, Space Weather
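A sketch of the cleansing step described above: fit an Isolation Forest on the abundant N-class samples, drop flagged outliers, then train a classifier. The data and contamination value are synthetic stand-ins, and a flat-feature SVC substitutes here for the project's TimeSeriesSVC:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_n = rng.normal(0.0, 1.0, size=(500, 20))    # stand-in N-class (non-flare)
X_f = rng.normal(2.0, 1.0, size=(50, 20))     # stand-in flare-class samples

# Fit the outlier detector on the majority class only, then purify it.
iforest = IsolationForest(contamination=0.05, random_state=0).fit(X_n)
keep = iforest.predict(X_n) == 1              # +1 = inlier, -1 = outlier
X_clean = X_n[keep]

# Train on the purified data; TSS / HSS2 scoring would follow.
X = np.vstack([X_clean, X_f])
y = np.concatenate([np.zeros(len(X_clean)), np.ones(len(X_f))])
clf = SVC(class_weight="balanced").fit(X, y)
```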
1 - Tuesday, 10-11 am
Ritu Arora
Wayne State University
Elements: Basil: A Tool for Semi-Automatic Containerization, Deployment, and Execution of Scientific Applications on Cloud Computing and Supercomputing Platforms
2314203
Basil is a tool for semi-automatic containerization, deployment, and execution of applications and workflows on cloud computing and supercomputing platforms. Basil can be used to build ready-to-use Docker/Singularity images without having to first learn about the process of creating the images. Users can provide the recipes for building their applications/workflows in one of the following forms: (1) Makefiles/CMakefiles, (2) scripts, (3) commands, or (4) a text-file with predefined keywords and notations (using the templates provided by us). Using these recipes (e.g., in Makefiles or text-files), Dockerfiles or Singularity definition files are generated automatically. A generated Dockerfile or Singularity definition file is then used to build a Docker or Singularity image. Next, the image is scanned for any vulnerabilities, signed, and if the user desires, released in public registries with appropriate licenses. These generated container images can be tested using the Basil web portal, and can be pulled to run or deploy on diverse hardware platforms on-prem or in the cloud.
Containerization, Docker, Singularity, automation
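To illustrate the recipe-to-Dockerfile idea, here is a hypothetical translation of a keyword-based recipe into a Dockerfile; the recipe keys and template are invented for this sketch and are not Basil's actual format:

```python
# Invented keyword recipe; Basil's real templates differ.
recipe = {
    "base_image": "ubuntu:22.04",
    "packages": ["build-essential", "cmake"],
    "build_commands": ["cmake .", "make -j4"],
    "entrypoint": "./my_app",
}

# Emit a Dockerfile line by line from the recipe fields.
lines = [f"FROM {recipe['base_image']}"]
lines.append("RUN apt-get update && apt-get install -y "
             + " ".join(recipe["packages"]))
lines.append("COPY . /src")
lines.append("WORKDIR /src")
for cmd in recipe["build_commands"]:
    lines.append(f"RUN {cmd}")
lines.append(f'ENTRYPOINT ["{recipe["entrypoint"]}"]')

with open("Dockerfile", "w") as f:
    f.write("\n".join(lines) + "\n")
```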
2 - Tuesday, 5-7 pm
Ritu Arora
Wayne State University
Collaborative Research: EAGER: Towards Building a CyberInfrastructure for Facilitating the Assessment, Dissemination, Discovery, & Reuse of Software and Data Products
2314202
The overarching goal of this project is to develop a software infrastructure for facilitating the assessment, discovery, dissemination, and reuse of publicly accessible software and data products. As a preliminary step towards meeting this goal, this project has initiated research and development activities for prototyping: (1) iTracker: the software infrastructure for tracking the user-defined metrics of products released and deployed on different platforms & computing environments, (2) CompChecker: a license and software-stack compatibility checker for advising the users on the feasibility of integrating or interoperating with existing products, and (3) Discovery Catalog: a prototype of a catalog of NSF-funded products which can display the most recent information captured by iTracker for each product of interest and integrate CompChecker as a feature.
Opuntia, metrics, software reuse, software interoperability, legal compatibility, licenses
1 - Tuesday, 10-11 am
Berkay Aydin
Georgia State University
Elements: Spatiotemporal Analysis of Magnetic Polarity Inversion Lines (STEAMPIL)
2104004
Extreme space weather events have the potential to cause significant disruptions in a wide array of technological systems, encompassing radio communications, telecommunication and navigation satellites, electrical power grids, space operations, and even commercial airline flights. The STEAMPIL project aims to establish an object detection and data analysis pipeline while investigating the influence of solar magnetic polarity inversion lines (MPILs) and associated shape-based characteristics derived from solar magnetograms. These distinctive features are utilized to facilitate the understanding and prediction of intense solar eruptive activity. As an integral aspect of this project, we investigate the predictive feasibility of using shape-based characteristics from solar magnetograms with a strong focus on MPIL-based features for operational space weather forecasting. This initiative also involves the development of innovative methods to construct data and prediction cyberinfrastructure, addressing the limitations of current space weather forecasting approaches and advancing their capabilities.
space weather forecasting; magnetograms; predictive analytics; data pipeline
3 - Wednesday, 10:30-11:30 am
Ryan S. Baker
University of Pennsylvania
Collaborative Research: Frameworks: Cyber Infrastructure for Shared Algorithmic and Experimental Research in Online Learning
DRL-1931419
Project “Collaborative Research: Frameworks: Cyber Infrastructure for Shared Algorithmic and Experimental Research in Online Learning” is developing shared infrastructure for the educational research community, combining tools for running automated experiments with data infrastructure for studying the results of online learning. Using a combination of the ASSISTments learning platform and the MORF MOOC data platform (including data from the University of Pennsylvania’s extensive offerings on Coursera and edX), our project makes data and A/B testing available to dozens of researchers. In particular, we make MOOC data that is hard to deidentify available to external researchers, both through a privacy-protecting platform and through manual deidentification of the largest released MOOC discussion forum post dataset to date. This year, our project has successfully supported a range of studies and received an open data set award.
MOOC, A/B testing, data enclave, privacy-protecting data infrastructure
1 - Tuesday, 10-11 am
Klaus R. Bartschat
Drake University
Elements: NSCI-Software – A General and Effective B-Spline R-Matrix Package for Charged-Particle and Photon Collisions with Atoms, Ions, and Molecules
1834740
This project concerns the development and distribution of a suite of computer codes that can accurately describe the interaction of charged particles (mostly electrons) and light (mostly lasers and synchrotrons) with atoms and ions. The results are of importance for the understanding of fundamental collision dynamics, and they also fulfill the urgent practical need for accurate atomic data to model the physics of stars, plasmas, lasers, and planetary atmospheres. With the rapid advances currently seen in computational resources, such studies can now be conducted for realistic systems, such as transition metals and other open-shell systems. The source code resulting from this project is publicly available. A website devoted to user-developer interaction (https://github.com/zatsaroi) is maintained together with the necessary code documentation and training materials; the training component is currently hosted on the Atomic, Molecular, and Optical Science Gateway (https://amosgateway.org/).
R-matrix, B-splines, electron- and photon-driven processes, open source, parallel computing
2 - Tuesday, 5-7 pm
Klaus Bartschat
Drake University
Frameworks: An Advanced Cyberinfrastructure for Atomic, Molecular, and Optical Science (AMOS): Democratizing AMOS for Research and Education
2311928
A challenge facing the Atomic, Molecular, and Optical Science (AMOS) community is the lack of a coordinated approach to using and sharing the computational tools that have grown organically within the community. The goal of this project is to create a comprehensive cyberinfrastructure (CI), through which AMO scientists can access resources for computational AMOS via the AMOS gateway. Our prototype hosts seven software suites with applications including the computation of electron collision and photoionization cross sections, and control of atomic and molecular systems by laser-atom/molecule interactions. The gateway is powered by an advanced CI to enable a flexible and easy-to-use platform for the broad AMOS community, as well as researchers and educators who are not computational AMOS scientists. The AMOS Gateway will serve as an excellent vehicle to educate students in computational AMOS via hands-on calculations, and as a hub for material created by developers for teaching, workshops, and conferences.
Science Gateway, Physics, Open Source, Parallel Computing, Community Building
3 - Wednesday, 10:30-11:30 am
Sean M. Bergin
Arizona State University
Frameworks: Collaborative Research: An Integrative Cyberinfrastructure Framework for Next-Generation Modeling Science
2103905
This Integrative Cyberinfrastructure Framework (ICF) project is intended to support and advance next-generation modeling of human and natural systems. Through the development of computational tools, educational initiatives, and an array of activities, this program continues to advance the development of computational modeling of social and ecological systems. 1) GitHub templates are being finalized for standardization and containerization to facilitate reuse and validation of model code. 2) Containers for common modeling platforms and example scripts are now available to facilitate the use of high-throughput computing for modeling using resources such as the Open Science Grid. 3) Online training materials continue to be expanded in order to build expertise in the use of cybertools to develop and analyze models of social and ecological systems. 4) This ICF project also extends efforts to encourage researchers to follow best practices while scaffolding new and innovative science.
Human and Natural Systems, Science Gateway, Computational Modeling, Best Practices for Scientific Software
1 - Tuesday, 10-11 am
Amneet Pal S. Bhalla
San Diego State University
Collaborative Research: Frameworks: Multiphase Fluid-Structure Interaction Software Infrastructure to Enable Applications in Medicine, Biology, and Engineering
OAC 1931368, OAC 1931516, OAC 1931372, OAC 1931524
IBAMR is software for simulating fluid dynamics, solid mechanics, and fluid-structure interaction (FSI). It offers leading community implementations of the immersed boundary (IB) method and its extensions, and it achieves high performance through its support for block-structured adaptive mesh refinement (AMR). This project enabled us to develop new computational infrastructure within IBAMR, including: 1) support for additional geometrical descriptions, including CAD-based geometries, for the immersed structures; 2) efficient adaptive multiphase flow solvers, including for low Mach number reacting flows and multiphase complex fluid flows; 3) new algorithms for rigid- and flexible-body FSI using interfacial fluid-structure coupling methods; and 4) scalable fluid-solid partitioning and coupling algorithms to provide high performance at extreme scales. Each of these capabilities is motivated by and anchored in specific applications led by one of the project PI/Co-PIs, including cardiovascular, esophageal, and pulmonary medicine and biology; wave energy converters; and additive manufacturing.
Multiphase Flows, Computational Fluid Dynamics, Fluid-Structure Interaction, Adaptive Mesh Refinement
1 - Tuesday, 10-11 am
Sanjukta Bhowmick
University of North Texas
Collaborative Research: Framework Implementations: CSSI: CANDY: Cyberinfrastructure for Accelerating Innovation in Network Dynamics
2104076
We present the CANDY (Cyberinfrastructure for Accelerating Innovation in Network Dynamics) framework for analyzing properties of dynamic networks. CANDY introduces a template for creating parallel algorithms that can efficiently update a graph property in a dynamic network. The template takes the set of changed edges and the information from the previous step as input, and updates the given graph property. The template follows two main steps. First, it identifies the affected vertices by processing each changed edge in parallel. Second, it updates the property of the affected vertices. As the updating operation can lead to new vertices being affected, the second step is applied iteratively until no affected vertices remain. In the poster we present performance results of updating different network properties (Single Source Shortest Path, Minimum Weighted Spanning Tree, Vertex Coloring, PageRank, Strongly Connected Components) as well as our work on generating synthetic dynamic graphs that mimic real-world data.
Dynamic Graphs, HPC, Graph Generator
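A minimal sequential sketch of the two-step template, using single-source shortest paths with an inserted edge; a real CANDY implementation processes changed edges and affected vertices in parallel, and the names here are illustrative:

```python
import math

def update_sssp(dist, graph, changed_edges):
    """dist: vertex -> distance; graph: vertex -> list of (neighbor, weight)."""
    # Step 1: identify vertices directly affected by the changed (inserted) edges.
    affected = set()
    for (u, v, w) in changed_edges:
        graph.setdefault(u, []).append((v, w))
        if dist.get(u, math.inf) + w < dist.get(v, math.inf):
            dist[v] = dist[u] + w
            affected.add(v)
    # Step 2: iteratively propagate updates until no vertex changes.
    while affected:
        next_affected = set()
        for u in affected:
            for (v, w) in graph.get(u, []):
                if dist[u] + w < dist.get(v, math.inf):
                    dist[v] = dist[u] + w
                    next_affected.add(v)
        affected = next_affected
    return dist

dist = {0: 0.0, 1: 1.0, 2: 3.0}
graph = {0: [(1, 1.0)], 1: [(2, 2.0)]}
print(update_sssp(dist, graph, [(0, 2, 1.5)]))   # edge insertion shortens 0->2
```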
1 - Tuesday, 10-11 am
Volker Blum
Duke University
DMREF: Collaborative Research: HybriD3: Discovery, Design, Dissemination of Organic-Inorganic Hybrid Semiconductor Materials for Optoelectronic Applications
1729297
This project focuses on design, discovery and dissemination ("D3") of hybrid organic-inorganic semiconductor (HOIS) materials, particularly those inspired by the paradigm of crystalline perovskites. The CSSI related component of the project is a curated database of HOIS materials. The database itself contains property data for over 500 materials at this time, with the ability to flexibly incorporate essentially any property type. The software behind the database, called MatD3, is general and was published as a separate open-source project, enabling materials databases for small workgroups or teams, e.g., to make data available in a searchable form in a browser, or through a documented REST API. The data in the HybriD3 database is openly available and is additionally incorporated into the much larger, long-term sustainable collection SpringerMaterials.
Materials database, perovskites, curated data
1 - Tuesday, 10-11 am
Brian Bockelman
Morgridge Institute for Research
CSSI Elements: EWMS - Event Workflow Management Service
2103963
Scientists and engineers are deploying ever more powerful instruments to collect signals from the physical world. Many of these signals are independent observations of physical phenomena, also known as “events”. The Event Workflow Management Service (EWMS) provides a key element in an evolving framework that is using the abstraction of events to provide the nation’s Science and Engineering community with an effective means to store, manage, and process the measured data. EWMS offers integrated, science-driven, and intelligent data and workflow management capabilities based on the Manager-Worker paradigm and treats the measured data as very large collections of statistically independent entities. EWMS increases computational throughput and lowers the hurdle for using distributed resources for data-intensive computational workflows. Increased computational throughput is achieved by incorporating knowledge of the resource requirements of an event or group of events into the scheduling decision making process. The data management features of EWMS reduce or completely remove data management concerns from the application workflow. EWMS employs existing web-scale technologies, such as message queues, to distribute data and tasks across distributed resources. This allows EWMS to support computational workloads of modern scientific collaborations.
manager-worker, message queues, distributed computing
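The Manager-Worker paradigm over a message queue can be sketched in a few lines; this toy version uses in-process queues, whereas EWMS itself builds on distributed message-queue services:

```python
# Toy manager-worker sketch: a manager enqueues independent events and
# workers consume them; stand-in for EWMS's distributed message queues.
from multiprocessing import Process, Queue

def worker(tasks: Queue, results: Queue):
    while True:
        event = tasks.get()
        if event is None:                     # sentinel: no more events
            break
        results.put((event, event ** 2))      # stand-in per-event computation

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    workers = [Process(target=worker, args=(tasks, results)) for _ in range(4)]
    for p in workers:
        p.start()
    for event in range(100):                  # statistically independent events
        tasks.put(event)
    for _ in workers:                         # one sentinel per worker
        tasks.put(None)
    processed = [results.get() for _ in range(100)]
    for p in workers:
        p.join()
    print("processed", len(processed), "events")
```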
2 - Tuesday, 5-7 pm
Brian P. Bockelman
Morgridge Institute for Research
Elements: Kingfisher: Storage Management for Data Federations
2209645
Storage is an important yet scarce resource within a distributed computing system. However, when compared to other resources – CPUs, GPUs, or memory – there’s a paucity of policies and mechanisms available to administrators for management. What is typically available – quotas – rarely provides a way for storage to be reclaimed by the owner. This project develops a storage management framework, LotMan, around the concept of a storage “lot”. A lot is a finite amount of storage that has an owner, can be sub-allocated, and has a reclamation policy outlining how storage will be released. The LotMan framework is being integrated with a file caching service (based on the XRootD software) and into the HTCondor Software Suite to bring storage management into the computing workflow. We’ll present the progress on these integrations and outline the next steps for making storage management first-class within the distributed computing ecosystem.
distributed computing, storage management, object caching
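The "lot" abstraction can be pictured as a small data structure; the fields and methods below are illustrative guesses for this sketch, not LotMan's actual schema:

```python
# Sketch of a lot: an owned, sub-allocatable quantity of storage with a
# reclamation policy. Field names are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Lot:
    owner: str
    capacity_gb: float
    expiry: str                       # reclamation policy: release after this date
    children: list = field(default_factory=list)

    def suballocate(self, owner: str, capacity_gb: float, expiry: str) -> "Lot":
        allocated = sum(c.capacity_gb for c in self.children)
        if allocated + capacity_gb > self.capacity_gb:
            raise ValueError("sub-allocation exceeds parent lot capacity")
        child = Lot(owner, capacity_gb, expiry)
        self.children.append(child)
        return child

cache = Lot(owner="site-admin", capacity_gb=1000.0, expiry="2025-01-01")
physics = cache.suballocate("physics-group", 400.0, "2024-06-01")
```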
1 - Tuesday, 10-11 am
Tom Boettcher
University of Cincinnati
Extending the physics reach of LHCb by developing and deploying algorithms for a fully GPU-based first trigger stage
2004364 and 2004645
The LHCb experiment is designed to search for new physical phenomena in proton-proton collisions at the LHC. To do this, the LHCb experiment processes proton-proton collisions at a rate of 30 MHz, producing a data volume of about 5 TB/s. This data volume must be reduced by orders of magnitude to allow for long-term storage. The reduction is performed by analyzing the data in real time using a two-stage software trigger that selects a small subset of data for storage. The first stage, called Allen, processes the full 5 TB/s of data produced by the LHCb detector using GPUs. This GPU-based trigger offers improved speed and flexibility, allowing the LHCb experiment to extend its physics reach using previously unfeasible reconstruction and analysis techniques.
GPUs, real-time analysis, particle physics, machine learning
2 - Tuesday, 5-7 pm
Michelle A. Borkin
Northeastern University
Collaborative Research: Elements: Enriching Scholarly Communication with Augmented Reality
2209624
This project leverages the tremendous investments in AR made by the corporate world over the past several years to allow astronomers to see and explore the 3D Universe. No expensive equipment beyond smartphones and tablets is needed. Over the course of the project, our interdisciplinary team of astronomers and computer scientists will design, test, and deploy an efficient and effective end-to-end system for embedding augmented reality figures in scholarly journals. By enriching scholarly communication, this new AR-based system is expected to accelerate the pace of scientific discovery. The system’s cyberinfrastructure innovation will ultimately open completely new channels for communication amongst all who rely on effective communication of high-dimensional data. The first year of the project has focused on extensive user testing of the existing AR technologies as well as evaluation of their ease of use and effectiveness to determine the system’s design requirements and interactivity.
Augmented reality, astronomy, publishing, visualization
2 - Tuesday, 5-7 pm
Mark Bowman
Las Cumbres Observatory
Frameworks: Target and Observation Manager Systems for Multi-Messenger and Time Domain Science
2209852
Astrophysics is now able to explore a range of time-variable phenomena, including discoveries from gravitational wave and neutrino detectors. It is often critical to obtain additional data to characterize selected targets within hours of discovery. The high data rate makes software tools invaluable for this task. Target and Observation Manager (TOM) systems help researchers to collate information on the specific targets of interest from community services, request additional observations, and coordinate their efforts with those of other teams in real-time. The TOM Toolkit is an open-source software package designed to make it easier for researchers to build and customize a TOM system for their own science. We will describe recent upgrades to the TOM Toolkit and discuss how it is designed to interact with other key services in the astronomical community.
infrastructure for coordinating teams within research communities; managing research in the Big Data era; software toolkits enhance research by diverse community; multi-messenger astrophysics; Rubin Observatory
1 - Tuesday, 10-11 am
Catherine Brinson
Duke University
Nanocomposites to Metamaterials: A Knowledge Graph Framework
CSSI-1835677
This collaborative project, led by 5 academic institutions, entails the ongoing development of an open-source, extensible, semantic cyberinfrastructure (CI) following FAIR principles for materials research (MaterialsMine), with specific application to polymer nanocomposites (NanoMine) and metamaterials (MetaMine). The CI comprises a database to store semi-automatically curated data with associated data validation protocols, along with a Materials Knowledge Graph (KG) for ontology-enabled exploration, discovery, and design tools and custom user dashboards. Recent developments include restructuring of the modularized backend for optimal speed and performance; streamlined, automated workflows for ongoing CI development, including expansion of the semantic KG framework and data validation protocols; data analysis and design tools available to the materials research community; new search and visualization interfaces for users with varying levels of technical expertise; new workflows in data curation, including NLP-enabled extraction; open API endpoints with associated documentation for external users; and case studies in both metamaterial and polymer-nanocomposite domains.
database, materials science, knowledge graph, metamaterials, composites
1 - Tuesday, 10-11 am
David Cantu
University of Nevada, Reno
Elements: The ThYme database and identifying representative amino acid sequences that originate thioester-active enzyme families
2001385
The thioester-active enzyme (ThYme) database provides sequences and tertiary structures of enzymes active on thioester-containing substrates, classified into families by sequence similarity. ThYme allows making structural, functional, and mechanistic inferences on any sequence in a family; this is of interest since most known sequences have not been experimentally studied, and each family in ThYme is based on sequences that have been experimentally verified to have a certain function. ThYme has new families and/or updated sequences and structures of: thioesterases, acyl transferases, acyl-CoA synthases, ketoacyl synthases, acyl carrier proteins, ketoacyl reductases, hydroxyacyl dehydratases, and enoyl reductases. ThYme has new functionalities: users can narrow down results by family, by whether sequences have known structures, or by whether any experimental work has been done. Each unique sequence is mapped to protein identifiers and gene identifiers. The ThYme database also allows searching by FASTA sequence and provides an interface for performing protein BLAST searches.
Enzyme, protein, database, structure, bioinformatics
2 - Tuesday, 5-7 pm
Lei Cao
University of Arizona
Collaborative Research: ELEMENTS: Tuning-free Anomaly Detection Service
2103832
Anomaly detection is critical in many scientific and engineering fields ranging from identifying signatures of new cyberattacks to detecting seizures in EEG medical time series data sets. However, although previously developed research has offered a plethora of anomaly detection algorithms, effective anomaly detection remains challenging for domain experts due to the manual tuning process: they have to manually engineer the features to prepare data; they have to determine which among these many algorithms is best suited to their particular domain; they have to tune many parameters by hand to make the chosen algorithm perform well. This is a challenging task, because domain experts often lack sufficient understanding of different detection algorithms and in-depth knowledge of the machine learning process. This project addresses this widespread problem by developing a robust self-tuning anomaly detection cyber-infrastructure (STAND). It enables scientists and engineers who have little understanding about anomaly detection techniques to effectively make use of them. It supports a rich variety of NSF communities, including but not limited to Medicine/Biology, Mechanical Engineering, and Cyber Security, all of which face data challenges requiring effective anomaly detection.
Anomaly Detection, Tuning Free
2 - Tuesday, 5-7 pm
Franck Cappello
U. Chicago
FZ: A fine-tunable cyberinfrastructure framework to streamline specialized lossy compression development
2311875
Scientific data reduction is necessary because scientific simulations and experiments produce more data than can be stored, communicated, and analyzed. Existing generic lossy compressors often do not correspond to user-specific use cases and requirements regarding reduction, speed, and information preservation. Hence, many research groups develop their own specialized lossy compressor, an effort that requires tremendous collaborations between compressor experts and domain scientists, demands extensive coding to optimize performance, and often leads to redundant research and development efforts. The FZ project works with 10 science partners to create a framework that will transform the development of specialized lossy compressors. FZ will provide a comprehensive ecosystem to intuitively research, compose, implement, and test specialized lossy compressors from a library of predeveloped, high-performance data reduction modules optimized for heterogeneous platforms. This project gathers four institutions: the University of Chicago, Indiana University, Florida State University, and Ohio State University.
Scientific data reduction, customized compression
3 - Wednesday, 10:30-11:30 am
Henri Casanova
University of Hawaii
Simulation-driven Evaluation of Cyberinfrastructure Systems
2103489
A key difficulty when designing and evolving CI runtime systems is the need to perform extensive experimental evaluations for a broad spectrum of relevant execution conditions. Real-world experiments are resource-, energy-, and labor-intensive, and are necessarily of limited scope. The alternative is to evaluate CI runtime systems in simulation. The WRENCH project provides a simulation framework that comes with already-implemented abstractions for current and emerging CI services. These abstractions make it possible to develop in-simulation versions of complex CI systems with minimal implementation effort. WRENCH builds on the popular SimGrid framework, which provides scalable and accurate low-level simulation abstractions. WRENCH provides a REST API that can be used to develop simulators in a language-agnostic manner. Using the API, one of the goals of this project is to demonstrate the feasibility of enhancing CI runtime systems with simulation capabilities, which we will demonstrate with workflow management systems.
Simulation of CI systems
1 - Tuesday, 10-11 am
Ankit Chakraborty
The University of Texas at Austin
Elements: Software: A Scalable Open-Source hp-Adaptive FE Software for Complex Multiphysics Applications
2103524
The hp3D finite element (FE) library is a tool for computational modeling of engineering applications. The library provides a framework for discretization of three-dimensional multiphysics problems described by systems of partial differential equations. hp3D can be installed and runs efficiently on various compute architectures, from laptops to state-of-the-art supercomputers. The first thrust of the CSSI project focuses on the development and parallelization of an adaptive multigrid solver based on the Discontinuous Petrov-Galerkin (DPG) FE method. This unique solver is specifically designed for numerical solution of challenging high-frequency wave propagation problems in acoustics, electromagnetics, and elastodynamics which are currently intractable to solve at large scale. The second thrust aims to maximize the impact and reach of our FE software within the user community; towards that goal, we simplify the user experience, deliver model problem implementations, publish documentation along with the dissemination of the software, and organize workshops for our stakeholders.
finite element; hp-adaptivity; DPG; multiphysics; numerical solvers
1 - Tuesday, 10-11 am
Kyle Chard
University of Chicago
Collaborative Research: Sustainability: A Community-Centered Approach for Supporting and Sustaining Parsl
2209919
Parsl is an open source software package for parallelizing Python programs and running them scalably and efficiently on small to very large local and remote resources (e.g., laptops, clusters, clouds, and supercomputers). Parsl allows developers to annotate functions in Python programs indicating opportunities for parallelism. When invoked, these functions are executed asynchronously and return a future to the program. Futures may be passed between functions creating a workflow graph. Parsl implements a flexible runtime model that supports different resource providers (e.g., HPC or cloud) as well as execution models (e.g., high throughput or extreme scale). This project is transitioning Parsl to a community-governed and community-supported open source project, with future income to be managed by a nonprofit organization under the direction of an elected Parsl Coordination Committee. The project is delivering a sustainable Parsl community by a) targeted technical activities that reduce costs and barriers for contribution, reducing future maintenance costs; b) building the Parsl community via outreach, engagement, and education programs, increasing potential future contributors; and c) establishing pathways and incentives to convert users to contributors and contributors to leaders, growing the next generation of the community.
Parsl, Parallel Python, Sustainability
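A minimal Parsl example of the pattern described above, using the local-threads configuration that ships with Parsl (cluster or cloud configurations substitute for `config` here; the functions and numbers are illustrative):

```python
# Annotated functions run asynchronously and return futures that can be
# composed into a workflow graph.
import parsl
from parsl import python_app
from parsl.configs.local_threads import config

parsl.load(config)                # local-threads executor for this sketch

@python_app
def square(x):
    return x * x

@python_app
def total(values):
    return sum(values)

futures = [square(i) for i in range(10)]          # executed in parallel
result = total([f.result() for f in futures])     # results feed a second app
print(result.result())                            # 285
```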
2 - Tuesday, 5-7 pm
Kyle Chard
University of Chicago
Frameworks: Collaborative Research: ChronoLog: A High-Performance Storage Infrastructure for Activity and Log Workloads
2104013
The ChronoLog project addresses the challenges posed by the massive amounts of activity data generated by modern scientific and industrial applications. Activity data, describing events rather than static entities, are produced at an unprecedented scale by sensors, scientific instruments, and system monitoring tools. ChronoLog is a distributed, tiered shared log storage ecosystem designed for efficient storage and processing of this data. It uses physical time for log entry distribution and ordering, eliminating costly synchronization. Multiple storage tiers enable elastic capacity scaling. The project develops novel algorithms and methodologies for managing, storing, and retrieving activity data, offering features such as strong consistency, durability, failure atomicity, transactional isolation, and asynchronicity. ChronoLog serves as a foundation for scalable plugins, including a SQL-like query engine, a streaming processor, a log-based key-value store, and a log-based TensorFlow module. This open-source software ecosystem supports diverse applications and addresses the scale and complexity of future systems.
Log Store, Activity Data, Distributed Logging, Tiered Storage, Data Ordering
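The physical-time ordering idea can be sketched with a toy in-memory log; ChronoLog itself is a distributed, tiered system, so this is conceptual only and the class name is illustrative:

```python
# Entries carry capture timestamps and are ordered by physical time,
# so out-of-order arrivals need no synchronization among writers.
import time
import heapq

class Chronicle:
    def __init__(self):
        self._entries = []            # min-heap keyed on physical time

    def record(self, payload, t=None):
        heapq.heappush(self._entries, (t if t is not None else time.time(), payload))

    def replay(self):
        return [heapq.heappop(self._entries) for _ in range(len(self._entries))]

log = Chronicle()
log.record("reading-b", t=2.0)        # arrives first...
log.record("reading-a", t=1.0)        # ...but happened earlier
print(log.replay())                   # time order: reading-a, reading-b
```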
2 - Tuesday, 5-7 pm
Kyle Chard
University of Chicago
funcX: A Function Execution Service for Portability and Performance
2004894
Globus Compute (previously funcX) is a scalable and high-performance federated platform for managing the remote execution of programming functions across diverse cyberinfrastructure systems, from edge accelerators to clusters, supercomputers, and clouds. Globus Compute allows developers to decompose applications into collections of functions that can each be executed in the best location, in terms of cost, execution time, data movement costs, and/or energy consumption. It thus integrates the extreme convenience of the function as a service (FaaS) model, developed in industry for industry-specific applications, with support for the specialized needs of scientific research. Globus Compute addresses important barriers to these new uses of research cyberinfrastructure systems, by enabling the intuitive, flexible, and scalable execution of functions without regard to physical location, scheduler architecture, virtualization technology, administrative domain, or data location. Flexible Globus Compute endpoint software makes it easy to expose arbitrary computing systems as FaaS computing platforms, thereby transforming existing cyberinfrastructure systems into high-performance function serving environments. The cloud-hosted Globus Compute service provides a REST interface for registering functions, discovering available endpoints, and managing the execution of functions on endpoints, all via a universal trust fabric and standard web authentication and authorization mechanisms.
FaaS, Serverless, Globus Compute
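Typical Globus Compute usage follows the familiar concurrent-futures pattern; the sketch below assumes the globus_compute_sdk package and uses a placeholder endpoint UUID:

```python
# Submit a function to a remote endpoint and wait for the result.
from globus_compute_sdk import Executor

def add(a, b):
    return a + b

endpoint_id = "00000000-0000-0000-0000-000000000000"   # placeholder UUID
with Executor(endpoint_id=endpoint_id) as ex:
    future = ex.submit(add, 1, 2)     # function ships to the endpoint
    print(future.result())            # 3, once the endpoint executes it
```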
2 - Tuesday, 5-7 pm
Jingyi (Ann) Chen
The University of Texas at Austin
Collaborative Research: Elements: Monitoring Earth Surface Deformation with the Next Generation of InSAR Satellites: GMTSAR
2209808
Spaceborne InSAR satellites can retrieve surface deformation history with centimeter-to-millimeter level accuracy at 10-100-meter spatial resolution. Given that the quantity and quality of InSAR data are expected to increase in the next 5 years, it is important to maintain and expand the capabilities of the existing open-source InSAR processing software GMTSAR. To promote the use of GMTSAR in a broader NSF science community, we provided additional installation options, developed a more user-friendly data processing pipeline, and organized the EarthScope GMTSAR summer short course (with 341 students enrolled in 2023). In this poster, we showcase recent science investigations and K-12 outreach education programs enabled by GMTSAR, and we discuss plans for future development of the software: (1) the development of the Python package for GMTSAR; (2) the new Geocoded SLC processing flow; and (3) a Persistent Scatterer (PS) processing module for mitigating vegetation decorrelation artifacts.
Satellite Geodesy, Surface deformation, InSAR, Earth Science
1 - Tuesday, 10-11 am
Charles Cheung
University of Delaware
Elements: Community portal for high-precision atomic physics data and computation
1931339
This project aims to bridge the gap between the development of atomic physics research codes and the need for data and software by the user community. To meet the needs of the community, we will (1) develop a scalable and sustainable online data portal with an automated interface for easy update and addition of new data, (2) continue the development of open-access atomic software based on our research codes that allow generating large volumes of data with automated accuracy assessments. The portal will provide energies, wavelengths, transition matrix elements, and rates for various transition types, branching ratios, lifetimes, polarizabilities, hyperfine constants, and other data. We plan to make data for over 100 atoms and ions, including highly charged ions, available for the user community by the end of this project. Building on a prototype developed in an initial project, the new portal will significantly advance scalability and sustainability. The present version of the portal is available at https://www.udel.edu/atom.
atomic physics, portal, high-performance computing
1 - Tuesday, 10-11 am
In Ho Cho
Iowa State University
Elements: Development of Assumption-Free Parallel Data Curing Service for Robust Machine Learning and Statistical Predictions
OAC-1931380
This project developed and shared the fractional hot-deck imputation (FHDI), a general-purpose assumption-free software, for handling missing values in broad scientific and engineering data. FHDI can cure multivariate missing data by filling each missing unit with observed values (thus, hot-deck) without resorting to distributional assumptions. However, handling ultra-incomplete data (i.e., concurrently big-n and big-p) with tremendous instances and high dimensionality has posed problems to FHDI. To tackle the challenges, we developed the ultra data-oriented parallel FHDI (named UP-FHDI). In addition to the parallel Jackknife method, UP-FHDI enables a computationally efficient ultra data-oriented variance estimation using parallel linearization techniques. Results confirm that UP-FHDI can tackle ultra data with more than 1 million instances and 10,000 variables, notably with uncertainty assessment. From diverse feasibility tests with multi-disciplinary data sets, UP-FHDI confirms its positive impact on the performance of subsequent machine learning (ML) methods and statistical inference, accelerating robust data- and ML-driven scientific and engineering innovations.
Ultra big incomplete data, missing data curing, imputation, uncertainty estimation, machine learning, statistical inference
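For intuition, plain (non-fractional) hot-deck imputation can be sketched in a few lines; UP-FHDI's fractional weighting, parallelization, and variance estimation are not reproduced here:

```python
# Simplified hot-deck illustration: fill each missing cell with a value
# drawn from observed donors of the same variable.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
X[np.array([1, 4]), np.array([0, 2])] = np.nan     # plant missing cells

for j in range(X.shape[1]):
    missing = np.isnan(X[:, j])
    donors = X[~missing, j]                        # observed values (the "deck")
    X[missing, j] = rng.choice(donors, size=missing.sum())
print(np.isnan(X).sum(), "missing cells remain")   # 0
```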
1 - Tuesday, 10-11 am
Eunseo Choi
The University of Memphis
Elements: Developing an integrated modeling platform for tectonics, earthquake cycles and surface processes
2104002
This project aims to develop a modern simulation code for computational tectonic modeling. To this end, an existing open-source code, DES3D, is enhanced in terms of modeling capacity through coupling with those used in other domains such as seismology and geomorphology; sustainability through modern software engineering practices; and performance through transition to an extreme-scale parallel finite element library. In this poster, we present the achievements in coupling DES3D with a surface process modeling code and in upgrading the finite element backbone of DES3D to the parallel finite element library, MFEM. We also present the future plans for coupling DES3D with a seismology code and for enhancing the sustainability of DES3D through modern software engineering practices.
computational tectonics, tectonic-surface process coupling, earthquake cycle modeling
2 - Tuesday, 5-7 pm
Laura E. Condon
University of Arizona
Collaborative Research: Framework: Software: NSCI: Computational and data innovation implementing a national community hydrologic modeling framework for scientific discovery
1835794
Hydrologic simulations have advanced greatly in recent years. We now have models spanning the entire US that capture groundwater, surface water and plant processes at high (1km) spatial resolution. Despite these developments the community of people who use and develop national models has remained small due to multiple barriers to entry. Developing hydrologic input datasets and validating the behavior of national models is challenging and our current platforms have taken years to develop. Also, the modeling platforms can be difficult for new users to pick up without some training and computer science background. Finally, national simulations rely on high performance computing and generate very large outputs that can be difficult to manage. HydroFrame seeks to remove these barriers. We build tools to make existing national models community resources that anyone can interact with. Our primary goal is to facilitate easy interaction with large computationally intensive hydrologic models and massive simulated outputs. Our tools enable users to subset model inputs and outputs for any watershed in the US, setup their own simulations, and visualize and analyze existing model outputs or newly generated results. We also develop free interactive educational tools and lesson plans to teach students of all ages about groundwater and the hydrologic cycle.
hydrology, groundwater, modelling
1 - Tuesday, 10-11 am
Sergiu M. Dascalu
University of Nevada, Reno
CSSI: Elements: Innovating for Edge-to-Edge Climate Services
OAC - 2209806
Given the advances in cyberinfrastructure and the Internet of Things, environmental sciences have vast potential to provide the community with real-time valuable information, including on hazards such as wildfires and flooding. To facilitate access by everyone to environmental data, this project focuses on "democratizing" portions of an existing regional earthquake and wildfire science network in Nevada. The aim is to create transformative real-time crowd-participating environmental data services, assembled on a pilot Nevada Weather (NevWx) edge-to-edge platform developed in the project. Objectives attained in year 1 include: design of the network integration plan for deployment of sensor arrays and network equipment in the Lake Tahoe Basin mountainous areas; permit acquisition for sensor and equipment installation in the Caldor Fire Perimeter near South Lake Tahoe; design and implementation of a pilot edge-to-edge stack and software core; requirements established for the project's web portal; and design of a user study for acquiring stakeholders' feedback.
edge-to-edge CI, citizen science, environmental monitoring, natural hazards, data platform, climate, wildfire, sensor networks, wireless, LoRaWAN
3 - Wednesday, 10:30-11:30 am
Steven C. DeCaluwe
Colorado School of Mines
Frameworks: Extensible and Community-driven Thermodynamics, Transport, and Chemical Kinetics Modeling with Cantera: Expanding to Diverse Scientific Disciplines
1931584
Modeling and simulation play key enabling roles in accelerating the transition away from carbon-intensive and environmentally disruptive energy and chemical technologies. However, such software tools have not kept pace with the increasing chemical complexity and interdisciplinarity of advanced technology solutions. Software frameworks must rigorously connect thermodynamics, kinetics, and mass transport properties, but must also be flexible enough to incorporate new material models and mechanisms without requiring significant simulation changes. This presentation gives an overview of Cantera, an open-source software package for problems involving chemical kinetics, thermodynamics, and species transport, with broad applications in chemistry, materials science, and engineering. The software provides a library of generalizable function calls, which automates calculations to support flexible modeling of complex thermo-kinetic and mass transport phenomena. In this presentation, we will give an overview of Cantera’s capabilities and highlight recent developments in both the software and the user and developer communities.
Chemistry, thermodynamics, open source software
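As a short example of the kind of calculation Cantera's function library automates, the following standard usage ignites a methane/air mixture in a constant-pressure reactor (mechanism file and conditions chosen for illustration):

```python
# Adiabatic constant-pressure ignition with GRI-Mech 3.0.
import cantera as ct

gas = ct.Solution("gri30.yaml")                  # thermodynamics + kinetics
gas.TPX = 1300.0, ct.one_atm, "CH4:1, O2:2, N2:7.52"   # stoichiometric CH4/air
reactor = ct.IdealGasConstPressureReactor(gas)
sim = ct.ReactorNet([reactor])

t = 0.0
while t < 2e-3:                                  # integrate for 2 ms
    t = sim.step()                               # adaptive time stepping
print(f"final T = {reactor.T:.1f} K")            # temperature after integration
```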
1 - Tuesday, 10-11 am
Ewa Deelman
University of Southern California
SI2-SSI: Pegasus: Automating Compute and Data Intensive Science
1664162
This project addresses the ever-growing gap between the capabilities offered by on-campus and off-campus cyberinfrastructures (CI) and the ability of researchers to effectively harness these capabilities to advance scientific discovery. Faculty and students on campuses struggle to extract knowledge from data that does not fit on their laptops or cannot be processed by an Excel spreadsheet and they find it difficult to efficiently manage their computations. The project sustains and enhances the Pegasus Workflow Management System, which enables scientists to orchestrate and run data- and compute-intensive computations on diverse distributed computational resources. Enhancements focus on the automation capabilities provided by Pegasus to support workflows handling large data sets, as well as usability of Pegasus that lowers the barrier of its adoption. This effort expands the reach of the advanced capabilities provided by Pegasus to researchers from a broader spectrum of disciplines that range from gravitational-wave physics to bioinformatics, and from earth science to material science.
scientific workflows, sustainability
1 - Tuesday, 10-11 am
Eugene DePrince
Florida State University
Collaborative Proposal: Frameworks: Sustainable Open-Source Quantum Dynamics and Spectroscopy Software
OAC-2103705
Chronus Quantum (ChronusQ) is an open source software package which targets the solution of challenging problems that arise in ab initio electronic structure theory, with a special emphasis on time dependence and the self-consistent treatment of relativistic effects. The current phase of ChronusQ development focuses on the inclusion of nuclear quantum effects via the nuclear electronic orbital approach and on advanced electron correlation methods rooted in unitary variants of the coupled-cluster approach. This poster provides an overview of recent developments along both of these lines.
electronic structure, quantum chemistry
3 - Wednesday, 10:30-11:30 am
Sheng Di
University of Chicago
Collaborative Research: Elements: ROCCI: Integrated Cyberinfrastructure for In Situ Lossy Compression Optimization Based on Post Hoc Analysis Requirements
2104023
Error-bounded lossy compressors, which can significantly reduce data volume while controlling data distortion, have been developed for years. However, setting an appropriate error bound for lossy compression is very challenging, and selecting the best-fit compression method is also non-trivial because of the strengths and weaknesses of different compression techniques and the diverse characteristics of datasets. The goal of this project, ROCCI, is to offer a complete series of automatic functions and services allowing users to transparently run the best-fit compressor at runtime. (1) ROCCI builds an efficient layer that leverages an existing compression adaptor library (LibPressio) and compression assessment library (Z-checker) to fill the gap between lossy compression and users' requirements; (2) it develops an efficient engine to determine the best-fit compressor with optimized settings; and (3) it develops a user-friendly infrastructure that integrates compression optimization and execution via the HDF5 dynamic filter mechanism.
lossy compression, scientific application, big data
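The selection loop at the heart of this idea can be sketched as follows; the compress/decompress pair is a toy quantizer standing in for a real compressor, not LibPressio's actual API:

```python
# Try candidate error bounds, check a post hoc quality metric (PSNR), and
# keep the most aggressive bound that still satisfies it.
import numpy as np

def compress(data, error_bound):
    # Toy "compressor": uniform quantization with an absolute error bound.
    return np.round(data / error_bound).astype(np.int32)

def decompress(quantized, error_bound):
    return quantized.astype(np.float64) * error_bound

data = np.sin(np.linspace(0.0, 10.0, 10_000))
best = None
for eb in (1e-2, 1e-3, 1e-4):                        # candidate error bounds
    recon = decompress(compress(data, eb), eb)
    mse = np.mean((data - recon) ** 2)
    psnr = 10.0 * np.log10(np.ptp(data) ** 2 / mse)
    if psnr >= 60.0 and (best is None or eb > best): # larger bound compresses more
        best = eb
print("selected error bound:", best)                 # 0.001 for this signal
```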
3 - Wednesday, 10:30-11:30 am
Yi Ding
The University of Texas at Dallas
CDS&E: Collaborative Research: Private Data Analytics Synthesis, and Sharing for Large-Scale Multi-Modal Smart City Mobility Research
2003874
Understanding real-time human mobility in urban areas has become increasingly important for many research areas, from mobile networking to transportation/urban planning, emergency response, and, most recently, pandemic mitigation. Many analytical models have been proposed. However, most of the relevant mobility data cannot be accessed by the research community at large. Fortunately, with the expansion of urban infrastructures, such mobility data has been collected by city governments and companies that are willing to share the data for social good. However, a key challenge is privacy, since such data usually contain sensitive information and system design details. To address this issue, the project aims to generate realistic yet synthetic mobility data through machine learning based on real mobility data analytics and then share these realistic synthetic data with the research community. The project aims to lower the entry barriers for interdisciplinary researchers addressing major scientific/societal challenges related to urban mobility.
Human Mobility, Data Synthesis
3 - Wednesday, 10:30-11:30 am
Oliver Dunbar
California Institute of Technology
Collaborative Research: HDR: Data-Driven Earth System Modeling
1835860
The multiscale nature of climate necessarily requires approximations within all subcomponents; these approximations may be based on machine learning, physics, or both. Our approach to make accurate and trustworthy predictions of future, never-observed climate regimes combines physics-based models with subcomponents learned from accessible data, which are often only indirectly informative about the modeled processes. We have developed a suite of model-agnostic machine learning tools to learn about subcomponent models from such data. These tools rely on ensembles of model simulations, effectively carried out on GPUs in distributed systems; and our framework for assessing uncertainty of calibrations (CalibrateEmulateSample) requires 1,000 times fewer model evaluations than traditional approaches. Our approach has produced several scientific successes, such as a unified turbulence and cloud parameterization calibrated with a library of large-eddy simulations, a neural network snow model trained from station data, and calibration-directed development of a parameterization for upper-ocean turbulence (CATKE).
Climate modeling, machine learning, GPUs, parameterization
2 - Tuesday, 5-7 pm
Peter Elmer
Princeton University
S2I2: Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP)
OAC-1836650
IRIS-HEP aims to develop the state-of-the-art software cyberinfrastructure required for the challenges of data intensive scientific research at the High Luminosity Large Hadron Collider (HL-LHC) at CERN, and other planned HEP experiments of the 2020s. These facilities are discovery machines which aim to understand the fundamental building blocks of nature and their interactions.
software high energy physics
3 - Wednesday, 10:30-11:30 am
Will Engler
University of Chicago
Garden: A FAIR Framework for Publishing and Applying AI Models for Translational Research in Science, Engineering, Education, and Industry
2209892
Despite the transformative potential of AI/ML, its broad application is often hindered by the need for specialized infrastructure expertise. To overcome these barriers, we will develop the Garden Framework: tools for constructing and applying Model Gardens, collections of curated models linked with the data and computing resources required to advance the work of a specific community. Our objectives include: Streamlining model publishing, discovery, and usage; Offering automated tools for assessing model accuracy, performance, and FAIRness; Harnessing distributed cyberinfrastructure for scalable training and inference; and Developing dedicated Model Gardens for domains like materials science, physics, and chemistry. The Garden Framework builds on a range of NSF-supported innovations, including Globus and Parsl. Alongside technical development, we are engaging new communities in applied AI research. The "Informatics Skunkworks" program engages undergraduates in team-oriented data science investigations. A new "Moment" platform will connect research teams with tech-savvy students and professionals.
AI, ML, Inference, Reproducibility, Materials Science, Chemistry
2 - Tuesday, 5-7 pm
John Andrew Evans
University of Colorado Boulder
Collaborative Research: Elements: EXHUME: Extraction for High-Order Unfitted Finite Element Methods
2104106
The overarching objective of this project is to construct a novel software library, EXHUME (EXtraction for High-order Unfitted finite element MEthods), to enable the use of classical finite element codes for unfitted finite element analysis. Unfitted finite element methods allow for the simulation of physical systems that are difficult if not impossible to simulate using classical finite element methods, and they also streamline the construction of design optimization technologies that optimize the geometry and material layout of an engineered system based on prescribed performance metrics. However, the computer implementation of an unfitted finite element method remains a challenging and time-consuming task even for domain experts. EXHUME overcomes this barrier by generating data structures that can be leveraged to transform classical finite element codes relying on element formation and assembly into unfitted finite element codes with little implementation effort.
Unfitted Finite Element Analysis, Immersogeometric Analysis, Extraction, Finite Element Software
1 - Tuesday, 10-11 am
Carlos Fernandez-Granda
NYU
Elements: Collaborative Research: Community-driven Environment of AI-powered Noise Reduction Services for Materials Discovery from Electron Microscopy Data
2103936
A key challenge for the proposed cyberinfrastructure is adapting denoising methods to heterogeneous datasets. Unsupervised methods have demonstrated impressive performance on synthetic benchmarks, but no metrics are available to perform evaluation in an unsupervised fashion. This is highly problematic for the proposed cyberinfrastructure, as our goal is to make it applicable in situations where ground-truth clean images are not available. We designed two novel metrics to evaluate performance: the unsupervised mean squared error (MSE) and the unsupervised peak signal-to-noise ratio (PSNR), which are asymptotically consistent estimators of the supervised MSE and PSNR. We evaluated the proposed metrics via controlled numerical experiments on transmission-electron-microscopy (TEM) images corrupted with synthetic Poisson noise and real-world TEM data directly relevant to the proposed cyberinfrastructure. Our results demonstrate that the proposed metrics enable unsupervised evaluation of denoising methods based exclusively on noisy data.
Evaluation metrics, deep learning, unsupervised data analysis, image processing
46
3 - Wednesday, 1030-1130RenatoJFigueiredo
University of Florida
Elements: EdgeVPN: Seamless Secure Virtual Networking for Edge and Fog Computing
2004441, 2004323
Edge computing encompasses techniques that complement the widely adopted cloud computing model: it allows time-sensitive applications and services to leverage resources deployed physically near mobile and IoT devices, and can significantly reduce the requirements (and cost) of streaming large datasets to the cloud. Emerging applications such as ecological forecasting stand to benefit from the ability to integrate edge and cloud resources seamlessly to enable deployment of software across this continuum. Unlike cloud data centers, however, edge data centers have limited capacity, and are more numerous and distributed. While applications can benefit from execution across multiple providers, networking across them presents challenges: each provider is often siloed, resources (e.g., containers and virtual machines) have private addresses, and middleboxes such as Network Address Translators (NATs) and firewalls constrain cross-provider connectivity. This poster presents an overview of the design and implementation of EdgeVPN, a virtual private network (VPN) with an open-source implementation that addresses these challenges. EdgeVPN is novel in how it: 1) spans multiple edge providers while providing a unified virtual Ethernet address space, 2) enables seamless connectivity among participating nodes that join and leave the network dynamically, 3) enforces privacy in communications by tunneling over the public Internet, while supporting high-performance overlay links within edge/cloud data centers, and 4) supports unmodified applications and middleware platforms for application deployment and management, such as containers and their orchestration. EdgeVPN's architecture combines a decentralized software-defined networking (SDN) control plane with a scalable structured peer-to-peer overlay of tunnels that forms its datapath, while leveraging standards for authentication, messaging, and NAT-traversal bootstrapping. Its open-source software can be deployed as containers providing network function virtualization for arm64 and amd64 edge and cloud platforms. The poster also gives an overview of applications that leverage the virtual network, including event-driven ecological forecasts for lakes and reservoirs.
edge computing, virtual network, SDN
47
3 - Wednesday, 1030-1130RainerJFries
Texas A&M University
CSSI: Frameworks: X-Ion Collisions with a Statistically and Computationally Advanced Program Envelope (X-SCAPE)
2004571
JETSCAPE and XSCAPE: Flexible and Modular Simulation Frameworks for High Energy Nuclear Physics - Collisions of nuclei at high energies, as studied, e.g., at the Large Hadron Collider (LHC), involve very complex processes at multiple scales. The underlying physics is often not fully understood in terms of first principles. Over the years, a patchwork of simulations has been developed by the community to describe various isolated aspects of these collisions. We introduced the JETSCAPE framework as a flexible, extensible framework that allows for comprehensive simulations of many key aspects of nuclear collisions. JETSCAPE houses established simulation codes as well as newly written simulation modules. The framework allows for easy exchange of the necessary data between the various modules. Users can readily add further modules to the framework, or run simulations with the default setup. XSCAPE extends the capabilities of JETSCAPE to lower collision energies and smaller systems, and adds further capabilities, including the simulation of electron-nucleus collisions, which will be studied at the new EIC facility. JETSCAPE and XSCAPE have been developed by a collaboration of experimental and theoretical physicists, computer scientists, and statisticians, in response to a community in need of a new generation of modern simulation tools.
nuclear physics, Monte Carlo simulations, modular framework
48
2 - Tuesday, 5-7 pmMattiaGazzola
University of Illinois at Urbana-Champaign
Elastica - A software ecosystem for modeling, simulation, design, and control of soft, compliant, and heterogenous structures interacting with their environment
2209322
We present Elastica, an open-source, ready-to-use ecosystem for the simulation, analysis, design, and control of mechanical structures made of millions of elastic fibers and operating in complex environments. A suite of HPC strategies enables tens-of-teraflops- to petaflops-grade simulations, demonstrated here in a variety of contexts, from the modeling and control of soft muscular arms to fiber-based metamaterials and swimming.
Cosserat rods, HPC, soft robotics, multiphysics
49
3 - Wednesday, 1030-1130Karthik NarayananGiriprasad
The Ohio State University
Elements: Data-Science Methods for Resource Allocation During Characterization of Dynamic Systems
2005012
Failure in materials often arises from localized stress or strain concentrations, referred to as "stress hot spots". The likelihood of these hot spots forming is influenced by microstructural factors such as local features and misorientations, among others, in relation to the applied load. Since these hot spots may lead to failure, it is advantageous to develop techniques to predict their formation from the initial microstructural images. We describe a convolutional encoder-decoder approach that first models the elastic response in the form of stress fields (simulated using Fourier-transform methods), which can then be used to determine high-stress regions. The model is trained on local patches from synthetic microstructures generated using DREAM.3D, and the results show that the encoder-decoder approach can effectively learn spatial relations that may lead to hot spots.
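For illustration, a minimal PyTorch sketch of this kind of convolutional encoder-decoder; the channel counts, depths, and thresholding rule are assumptions made for the example, not the authors' exact architecture:

    import torch
    import torch.nn as nn

    class StressEncoderDecoder(nn.Module):
        """Maps a microstructure patch (e.g., orientation channels) to a stress map."""
        def __init__(self, in_channels=3):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            )
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),  # stress field
            )

        def forward(self, x):
            return self.decoder(self.encoder(x))

    model = StressEncoderDecoder()
    patch = torch.randn(8, 3, 64, 64)        # batch of synthetic microstructure patches
    stress = model(patch)                     # (8, 1, 64, 64) predicted stress fields
    # Crude illustrative rule for flagging candidate hot spots:
    hot_spots = stress > stress.mean() + 2 * stress.std()

In practice such a model would be trained with a pixelwise regression loss (e.g., MSE) against the Fourier-transform-simulated stress fields.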
Microstructure, Characterization, Machine Learning, Rare Events Prediction, Hot Spots
50
1 - Tuesday, 10-11 amAndreasWGoetz
University of California San Diego
Collaborative Research: Frameworks: Interoperable High-Performance Classical, Machine Learning and Quantum Free Energy Methods in AMBER
2209717
The goal of this CSSI Frameworks project is to develop accurate and efficient free energy software tools within a powerful multiscale modeling framework in the Amber program package. This framework will enable molecular simulations with new classes of potential energy functions that involve interoperability between reactive, machine learning, and quantum many-body potentials. These potentials provide enhanced accuracy, robustness, and predictive capability with respect to classical molecular mechanics force fields and can be customized to meet the needs of applications in fields as diverse as chemical catalysis, enzyme design, and drug discovery. Free energy capability will be enabled through a robust endpoint approach that leverages the performance of the GPU-enabled Amber molecular dynamics engine. The framework combines capabilities of the PuReMD, DeePMD, QUICK and Amber programs and employs industry-standard programming models to ensure performance portability and scalability on a diverse range of hardware architectures.
Free energy, force field, quantum mechanics, QM, machine learning, ML, Amber, interoperability, GPU, accelerator, portability, CUDA, HIP, SYCL, JAX, open-source software
51
2 - Tuesday, 5-7 pmJingGuo
University of Florida
CDS&E: Machine-Learning-Driven Methods for Multiobjective and Inverse Design of van-der-Waals-Material-Based Devices
2203625
Modeling, simulation, and design play a critical role in essentially every engineering field. For emerging nanoelectronic devices, computationally efficient and physically meaningful models and design methods are essential for understanding experiments and translating early lab demonstrations into practical device technologies. Significant progress has been made in demonstrating novel device structures and promising device performance based on atomically thin two-dimensional (2D) semiconductors and their heterojunctions (HJs). We have been developing physics-informed ML models in an embedded or hybrid manner for quantum transport device simulations, with attention to improving efficiency, respecting device physics, and reducing the amount of training data required. A multi-objective optimization method to systematically assess and comprehensively optimize 2D semiconductor and HJ devices is also being developed by simultaneously taking multiple technologically important device performance metrics into consideration.
Simulation and design, nanoscale devices, 2D semiconductors
52
3 - Wednesday, 1030-1130ThomasHaine
Johns Hopkins University
Collaborative Research: Framework: Data: Toward Exascale Community Ocean Circulation Modeling
1835640
The goals of the project are to deliver: 1. A benchmark, accurate global ocean circulation solution at O(1) km horizontal resolution. 2. Open-source software tools enabling efficient storage, indexing, and analysis of petabyte-scale ocean/atmosphere/climate datasets. 3. A Data Access Portal that deploys these tools, together with custom-built high-performance storage and computing resources, to provide scalable interactive analyses and visualizations of the benchmark solution to the climate and computer science communities. 4. Explorations of machine learning frameworks for automated identification of important events and data compression, with working prototypes for use in coupled climate models. 5. A foundation and a path to migrate computational oceanography to exascale. 6. A fully-functioning instance of the sort of cyberinfrastructure that will increasingly be needed by next generation simulation software in geosciences and beyond.
ocean circulation, ocean models, community data analysis, computational oceanography
53
1 - Tuesday, 10-11 amAmmarHHakim
Princeton University
Collaborative Research: Frameworks: A Software Ecosystem for Plasma Science and Space Weather Applications
2209471
From the plasma environment around planets to that around black holes, plasmas are ubiquitous in the visible universe. Computer simulations are critical to understanding such complex systems. We are building an open-source, collaborative software ecosystem to provide advanced simulation capabilities to the whole plasma physics community. This ecosystem will consist of two major parts: a core simulation engine and a Plasma Science Virtual Laboratory that provides access to these advanced tools via an easy-to-use web interface. The simulation engine is built on top of the open-source Gkeyll framework. The Plasma Science Virtual Laboratory is designed to broaden researcher and educator access to plasma science and space weather analysis on high-performance computing platforms. Our project aims to be community driven and welcomes enhancements via new solvers, improvements to existing code, and updates to documentation via GitHub pull requests or issue creation.
plasma simulations, space-weather, astrophysics
54
1 - Tuesday, 10-11 amChadRHanna
Penn State
An A+ Framework for Multimessenger Astrophysics Discoveries through Real-Time Gravitational Wave Detection
2103662
We present the status of our framework to support real-time gravitational wave detection with LIGO. Over the past two years, we have been developing integrated software and services to perform LIGO data calibration, non-stationary noise identification, and searches for merging black holes and neutron stars. In May 2023, LIGO began its fourth observing run, and our framework has successfully enabled over 30 new gravitational wave discoveries, with results disseminated in real time to the public to support astronomical follow-up observations.
LIGO, Black Holes, Neutron Stars, Gravitational Waves
55
1 - Tuesday, 10-11 amXubinHe
Temple University
Collaborative Research: Elements: ProDM: Developing A Unified Progressive Data Management Library for Exascale Computational Science
2311756/2311757/2311758
This CSSI project aims to develop ProDM, a sustainable framework that supports the progressive management of scientific data to facilitate its use in scientific applications. It will enable new scientific research and novel findings by providing a new way to manage and analyze data. ProDM is centered upon the unification of viable progressive representations and tailored development for in-situ and post-hoc analytic routines. In particular, it involves three key components: 1) a data engine to unify state-of-the-art progressive representations and to provide portable hardware support for accelerators as well as interoperable software interfaces to other libraries; 2) an in-situ engine to facilitate the use of progressive representations for in-situ data analytics; and 3) a post-hoc engine to efficiently access progressive data and improve the performance of data retrieval for post-hoc data analytics. ProDM will be deployed on campus-wide computing infrastructures and leadership systems for integration and evaluation with real-world scientific applications.
Progressive data management, in-situ data analytics, post-hoc data analytics, scalable cyberinfrastructure
56
2 - Tuesday, 5-7 pmJeffHeflin
Lehigh University
Elements: CRISPS: Cell-Centric Recursive Image Similarity Projection Searching
2246463
Materials scientists use microscopy modalities to determine the structure, order, and periodicity that affect properties. The core challenge is that only a fraction of microscopy data is published. Even if raw microscopy data was published and available, there are no good methodologies to conduct unstructured schema-free metadata searches or tools to search, recall, and compare similar microscopy images based on physics-aware features. CRISPS will be a full-stack software solution that seamlessly integrates three novel software concepts. 1) DataFed: a federated scientific database for collecting, collating, and searching scientific data and metadata. 2) Schema-Free Search: a tool for data-centric indexing and searching metadata without schemas. 3) Recursive Image Similarity Projections: a tool to interactively explore image similarity. CRISPS will harness the power of microscopy, scientific databases, machine learning, and graphical user interfaces (GUI) to accelerate the exploration of synthesis-structure-property relationships to design novel materials.
federated data, data exploration, image similarity, materials, microscopy
57
1 - Tuesday, 10-11 amHendrikHeinz
University of Colorado Boulder
Collaborative Research: Framework: Cyberloop for Accelerated Bionanomaterials Design
OAC-1931587
We highlight the development of an integrated cyberinfrastructure for molecular dynamics simulations of nanomaterials and biomaterials from atoms to micrometers, with high accuracy and a high level of automation, suitable for use by experts while significantly lowering the barrier for non-experts. We integrated the INTERFACE force field (IFF) and the IFF surface model database for a wide range of inorganic materials into CHARMM-GUI and OpenKIM, resulting in a new CHARMM-GUI Nanomaterial Modeler as an easy-to-use, web-based interactive platform for the simulation of inorganic, organic, and biological hybrid materials ("bionanomaterials"). OpenKIM was expanded with new functionality to facilitate performance comparisons of bonded force fields such as IFF and CHARMM in addition to nonbonded (reactive) force fields. The chemistry coverage of IFF was tripled, including metals, oxides, minerals, 2D materials, and gases, with full compatibility with solvents, drugs, proteins, DNA, lipids, polymers, and unlimited materials combinations. The project has to date supported 5 graduate students (including minority students), 3 postdocs, and multiple undergraduate students.
materials simulation, biomolecular simulation, nanostructure, force fields, software, interatomic potential databases
58
1 - Tuesday, 10-11 amTimoHeister
Clemson University
Collaborative Research: Frameworks: Software: Future Proofing the Finite Element Library Deal.II -- Development and Community Building
2015848
Partial differential equations (PDEs) are used as mathematical models throughout the sciences and engineering. Their numerical solution is of great relevance in understanding, simulating, and optimizing natural, human, and engineered systems. In many applications, finite element methods (FEM) are the method of choice, converting the PDE into finite-dimensional, computationally solvable problems. The deal.II project is an open-source C++ software library supporting the creation of finite element codes and an open community of users and developers. First, we report on our success in community building throughout the project and the best practices we developed. Second, we report on recent major developments, including adaptive, massively parallel multigrid solvers and improvements in performance portability. Finally, we discuss efforts in infrastructure, continuous integration, packaging, and more.
deal.II, finite elements, software, sustainability
59
2 - Tuesday, 5-7 pmMatthiasHeyden
Arizona State University
Elements: Streaming Molecular Dynamics Simulation Trajectories for Direct Analysis: Applications to Sub-Picosecond Dynamics in Microsecond Simulations
2311372
In this project, we enable streaming analysis of molecular dynamics (MD) simulation trajectories with direct data transfer from a running simulation to the analysis software via a TCP/IP socket application programming interface (API). Our proposed streaming technology provides an alternative to post-simulation analysis of trajectory output files and eliminates the bandwidth and disk-storage limitations associated with analyzing dynamical processes on multiple time scales. For example, streaming enables monitoring fast molecular vibrations during slow conformational changes in proteins, which can reveal key insights into the mechanisms of processes ranging from enzymatic catalysis to drug binding. The streaming analysis and a recently developed set of tools to study protein solvation maps (3D-2PT+) will be implemented in the open-source MDAnalysis library as core functions and MDAKit plugins, respectively, making them available to thousands of active users of MDAnalysis and ensuring compatibility with a wide range of MD simulation software packages.
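Schematically, the pattern is a length-prefixed frame stream over a TCP socket; the sketch below (hypothetical endpoint and frame format, not the project's actual API) shows both ends of such a pipe:

    import socket
    import struct
    import numpy as np

    HOST, PORT = "127.0.0.1", 9000   # hypothetical endpoint of the analysis process

    def send_frame(sock, coords):
        """Simulation side: ship one frame (N x 3 float32 coordinates) over TCP."""
        payload = coords.astype(np.float32).tobytes()
        sock.sendall(struct.pack("!I", len(payload)) + payload)  # length-prefixed

    def recv_frame(sock):
        """Analysis side: read one length-prefixed frame and rebuild the array."""
        header = sock.recv(4, socket.MSG_WAITALL)
        (nbytes,) = struct.unpack("!I", header)
        payload = sock.recv(nbytes, socket.MSG_WAITALL)
        return np.frombuffer(payload, dtype=np.float32).reshape(-1, 3)

Because each frame is consumed as it arrives, nothing ever needs to touch the file system, which is what removes the bandwidth and storage bottleneck for analyses spanning sub-picosecond to microsecond scales.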
streaming, molecular dynamics
60
1 - Tuesday, 10-11 amJianHuang
University of Tennessee
Elements: Towards A Scalable Infrastructure for Archival and Reproducible Scientific Visualizations
2209767
Today's science revolves around leading-edge datasets: data that scientists need to carefully analyze so that they can draw reliable scientific conclusions. The rate at which these leading-edge datasets are becoming larger and more complex is accelerating every day. In many ways, having access to a dataset does not equal, or even come close to, having access to the insights in the dataset. This nuanced but crucial difference in accessibility creates a deep barrier to making scientific results reproducible. To this end, "Accessible Reproducible Research", published by Science in 2010, presented a system for reproducible research. A decade later, unfortunately, accessible reproducible research is still in its infancy. It turns out that this barrier is much more fundamental than previously believed, even though on the surface it seems solvable by investing resources and setting guidelines and policies. The real challenge is that the computing toolsets, the working environments, and the work processes of the original team of scientists are very difficult for a different team of scientists to recreate with precision. This difficulty stems from the rapid speed at which computing technology is advancing, which makes freezing a computing environment in a practical manner nearly impossible. In addition, scientific intuition is difficult to codify: simply documenting a new idea is not enough to communicate what a scientist saw before pursuing that idea. In that respect, making accessible reproducible research a reality requires better methods and tools. In this project, the investigators focus on the visualization step of data analysis, which is a central component of scientific discovery. The project's aim is to develop an Archiving Infrastructure for Reproducible Interactive Visualization (AIRIV). Through this infrastructure, the investigators will demonstrate how visual explorations of large and complex data can be reliably captured, efficiently stored, easily shared, and freely reused by any user. This project will improve the accessibility of reproducible research and promote the progress of science. For areas such as medicine and pharmaceutical research, it will provide an unprecedented channel to accelerate translational research and advance national health. This project builds upon research funded by a prior NSF CISE Research Infrastructure award. In that previous project, the investigators developed a method to capture the interactive user experience of visualization tools and to share the captured experience without the need to share the original software or the original data. Furthermore, with a method called Loom, a user reusing a captured experience has the freedom to explore beyond the exact sequence in which the previous user used the tool. In this new project to create AIRIV, the investigators will focus on web-based visualization dashboards, which represent the standard way for scientists around the world to interact with their data and derive insights. The project will first build a general AIRIV JavaScript library that can be imported by any web-browser-based application. Using the AIRIV library, developers of web-based visual dashboards can easily implement automatic generation of Loom objects in their dashboards, and will be able to instrument their applications to store new provenance information with Loom objects as well.
The investigators will then conduct performance and scaling tests to understand the tradeoffs between hosting choices in local, institutional-cluster, and community-shared data infrastructure settings. Operators of scientific facilities can use the findings to help science communities make informed choices as to where and how to host scientific visualization archives for better shareability and cost efficiency. The investigators will also develop machine learning methods that can compare Loom objects and externalize commonalities and patterns across an entire archive of Loom objects. These new methods will enable a search-by-example functionality for AIRIV archives. For requirements collection, continuous improvement, and deployment testing, the investigators will engage the Mayo Clinic & Illinois Alliance, which serves as a framework for several technologies in healthcare, many of which center around the research and development of dashboard/analytical tools. We target two such analytics efforts, OmiX and KnowEnG, both of which are developed at the National Center for Supercomputing Applications (NCSA).
Reproducible, Sharable, Accessible Research
61
1 - Tuesday, 10-11 amDavidHudak
Ohio Supercomputer Center
Frameworks: Software NSCI-Open OnDemand 2.0: Advancing Accessibility and Scalability for Computational Science through Leveraged Software Cyberinfrastructure
1835725
Developed by the Ohio Supercomputer Center and funded by NSF, Open OnDemand (openondemand.org) is an open-source portal that empowers students, researchers, and industry professionals with remote web access to supercomputers. Clients can manage files and jobs, create and share apps, and run GUI applications. From a client perspective, key features are that it requires zero installation (since it runs entirely in a browser), is easy to use (via a simple interface), and is compatible with any device (even a mobile phone). From a system administrator perspective, key features are that it provides a low barrier to entry for clients of all skill levels, is open source with a large community behind it, and is configurable and flexible for users' unique needs. OOD is now in use at over 500 research computing service providers globally, including public and private academic institutions, government agencies, non-profit organizations, and private industry.
Cyberinfrastructure, Science Gateways
62
1 - Tuesday, 10-11 amSarahEHuebner
University of Minnesota
Building the 21st Century Citizen Science Framework to Enable Scientific Discovery Across Disciplines
1835530
Our efforts toward improved Cyberinfrastructure for Sustained Scientific Innovation have evolved around three key objectives. 1) Combining Modes of Citizen Science: building infrastructure that integrates the Zooniverse platform with other citizen science websites, such as CitSci.org, Wildlife Insights, iNaturalist, and TrapTagger, to make it easier for ecology project managers to connect with volunteers at both the collection and classification levels. 2) Smart Task Assignment: accelerating label gathering and maximizing the efficiency of volunteer effort through tools that dynamically combine machine learning predictions with volunteer annotations to determine the number of human annotations required, based on accuracy and agreement, as classification proceeds. 3) Data-as-Subject: infrastructure that empowers volunteers to engage directly with data in the Zooniverse subject viewer by presenting multi-modal data in combination and allowing volunteers to manipulate different types of data to determine the appropriate classification.
citizen science, machine learning, human-in-the-loop
63
2 - Tuesday, 5-7 pmJamesWHurrell
Colorado State University
Collaborative Research: Frameworks: Community-Based Weather and Climate Simulation With a Global Storm-Resolving Model
2005137
The open-source Community Earth System Model (CESM) is both developed and applied to scientific problems by a large community of researchers and is managed by the National Center for Atmospheric Research (NCAR). The CESM includes sub-models of the atmosphere, ocean, land surface, and sea ice. EarthWorks is a five-year project to develop a global coupled model of the atmosphere, ocean, and land surface based on the CESM. All three components will use the same very high-resolution ~4 km global grid. The atmosphere and ocean sub-models use closely related dynamical cores, developed at NCAR and at Los Alamos National Laboratory, respectively, which are well suited for applications that require very high spatial resolution. The use of kilometer-scale resolution makes it possible to eliminate the particularly troublesome parameterization of deep cumulus convection (i.e., thunderstorms).
Earth system model; global storm resolving; weather; climate
64
2 - Tuesday, 5-7 pmMatthiasIhme
Stanford University
Enabling High-fidelity Turbulent Reacting-flow Simulations through Advanced Algorithms, Code Acceleration, and High-order Methods for Extreme-scale Computing
1909379
Accurate numerical simulations of turbulent reacting flows are of practical importance for several applications, including gas turbines and internal-combustion engines for power generation and transportation, risk mitigation associated with reactor safety, and scientific discovery of novel energy-conversion strategies. However, commonly employed legacy codes rely on simplified approximations of the governing equations that describe these reacting flows. As such, they exhibit deficiencies in accurately representing the underlying physical processes, involving turbulent mixing, localized reaction zones, and heat release. So-called discontinuous Galerkin (DG) methods have been identified as a promising alternative. These methods are characterized by an integral formulation that provides a general numerical discretization of the governing equations with significantly improved fidelity. Other advantages are the flexibility in representing complex physical processes and excellent performance on high-performance computing systems. While the potential of these DG methods has been recognized, major roadblocks to adoption include the lack of cyberinfrastructure (CI) for scientific discovery and engineering analysis, as well as the need for innovative programming techniques to enable scalable simulations on distributed machines with heterogeneous processors and complex memory hierarchies. This project addresses these research challenges and develops novel numerical methods and advanced programming paradigms for high-performance simulations of turbulent reacting flows. The project thus serves the national interest, as stated in NSF's mission: to promote the progress of science and to secure the national defense.
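To illustrate what "integral formulation" means here, consider the textbook DG weak form for a scalar conservation law (a generic sketch, not this project's specific discretization). Multiplying $\partial_t u + \nabla \cdot F(u) = 0$ by a polynomial test function $v_h$ on each mesh element $K$ and integrating by parts yields

    \int_K \frac{\partial u_h}{\partial t} \, v_h \, dx - \int_K F(u_h) \cdot \nabla v_h \, dx + \oint_{\partial K} \hat{F}(u_h^-, u_h^+) \cdot n \, v_h \, ds = 0,

where $u_h$ is the element-local polynomial approximation and $\hat{F}$ is a numerical flux coupling neighboring elements through their face values $u_h^-, u_h^+$. Raising the polynomial degree increases the order of accuracy without widening the stencil, which is what makes the method both high-fidelity and well suited to distributed-memory parallelism.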
High-order methods; Discontinuous Galerkin
65
3 - Wednesday, 1030-1130RyanJacobs
University of Wisconsin-Madison
Collaborative Research: Framework: Machine Learning Materials Innovation Infrastructure
1931298
The goal of this work is to support the development and use of machine learning models in materials science and engineering through a new ecosystem centered on accessible formatted data, containerized models, and tools for efficient model fitting with robust quality assurance. We have developed the Foundry-ML cloud resource, which hosts formatted datasets for frictionless use in machine learning models and containerized models that can be run with just a few lines of Python. We have also developed the MAterials Simulation Toolkit for Machine Learning (MAST-ML), an open-source, Python-based software package that allows users to easily automate each stage of the supervised ML pipeline. MAST-ML includes predicted uncertainty quantification (i.e., error bars) and domain of model applicability, and interfaces with Foundry-ML to facilitate cloud-based hosting of fitted models. We are developing multiple machine learning models to facilitate materials property prediction and electron microscopy analysis and hosting them in Foundry-ML.
machine learning, materials science, model hosting, materials data, cloud infrastructure
66
3 - Wednesday, 1030-1130RajeshKalyanam
Purdue University
Elements: Data: U-Cube: A Cyberinfrastructure for Unified and Ubiquitous Urban Canopy Parameterization
1835739
Urban canopy parameters (UCPs) can be used in model simulations to study the health and behavior of a city, determine the ability to sustain a growing population, and study potential impacts of extreme weather events. The ability to identify and compute urban canopy parameters has been a missing element in city models; this project develops that capability for use in city design and analysis, integrating weather models and remote sensing data to infer a 3D model of cities of various sizes. This project develops cyberinfrastructure which uses a novel inverse modeling approach incorporating satellite images, social science and urban zonal data, to infer a 3D model of a city from which UCPs could be derived for use in simulation models. The focus is on weather modeling, urban parameterization and a desire to better understand sustainable urbanization. The main cyberinfrastructure products are 3D urban models and UCP values for urban locations.
urban,weather,modeling,CI
67
2 - Tuesday, 5-7 pmNagarajan Kandasamy
Drexel University
Elements: Software Infrastructure for Programming and Architectural Exploration of Neuromorphic Computing Systems
2209745
Neuromorphic computing systems, which mimic biological neurons and synapses, can implement machine-learning (ML) tasks in a power-efficient fashion. Major challenges for neuromorphic computing lie in its adoption by users and, from a system developer's perspective, in coping with faster time-to-market pressure for new chip designs. Our project is developing a software infrastructure called NeuroXplorer, which helps both end-users and developers of neuromorphic systems. It allows ML tasks to be mapped onto neuromorphic architectures in the most efficient way possible, and provides analysis and synthesis tools to explore new chip designs that meet the needs of machine-learning workloads. NeuroXplorer supports code generation for neuromorphic chips from a high-level specification of the ML task; provides synthesis tools to map ML tasks onto novel neuromorphic architectures built using FPGAs; and supports hardware/software design-space exploration of new architectures. NeuroXplorer is distributed under an open-source license to promote the adoption of neuromorphic computing.
Neuromorphic computing, software infrastructure, FPGA, hardware/software codesign
68
1 - Tuesday, 10-11 amMahmutTKandemir
Penn State
CSSI Frameworks: Re-engineering Galaxy for Performance, Scalability and Energy Efficiency
OAC-1931531
In our project, we currently pursue three complementary directions. First, we are enhancing the Galaxy Runtime System with accelerator support, including GPUs, FPGAs, and custom ASIC accelerators. For example, one item we are currently working on is an FPGA implementation of Adaptive Banded Event Alignment (ABEA), a key algorithmic component of the Nanopolish call-methylation module. Second, we are porting select applications and tools to this enhanced runtime environment. GPU-accelerated genome alignment tools, such as SegAlign, can benefit from using multiple GPUs. However, our experiments show that these genome alignment tools do not fully utilize powerful data-center-grade GPUs, such as NVIDIA's A100. We are investigating hardware features exclusive to NVIDIA's data-center-grade GPUs, such as Multi-Instance GPU (MIG) together with the Multi-Process Service (MPS), to increase both GPU and CPU utilization by running multiple GPU kernels simultaneously. Third, we are overhauling the Galaxy storage system to make use of emerging parallel and distributed file systems as well as cloud services (e.g., serverless computing).
Bioinformatics, HPC, storage systems, cloud computing
69
2 - Tuesday, 5-7 pmPeterKasson
University of Virginia
SCALE-MS - Scalable Adaptive Large Ensembles of Molecular Simulations
1835780
The SCALE-MS project seeks to enable high-level algorithms for molecular simulation and simplify their execution on high-performance cyberinfrastructure. Increasingly, molecular simulation problems are best formulated not as a single simulation or even a static pipeline but as ensembles of simulations where the logic may adapt to intermediate results. The SCALE-MS project has created high-level APIs for specifying this logic and links API execution to a capable runtime framework leveraging the RADICAL stack that runs on most NSF high-performance computing clusters. We present both an initial implementation of this adaptive ensemble simulation API and lessons learned.
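The shape of such an adaptive-ensemble computation can be sketched in a few lines of Python (a generic illustration of the pattern, not the SCALE-MS API; all arguments are user-supplied callables):

    # Generic adaptive-ensemble loop: run a batch of simulations, analyze the
    # intermediate results, and let the analysis decide the next generation.
    def adaptive_ensemble(run_simulation, analyze, converged, initial_inputs):
        inputs = list(initial_inputs)
        results = []
        while True:
            results.extend(run_simulation(x) for x in inputs)  # ensemble members
            summary = analyze(results)            # intermediate analysis
            if converged(summary):
                return summary
            inputs = summary["next_inputs"]       # adapt based on what was learned

The value of a framework like SCALE-MS is that a loop written at this level of abstraction can be dispatched to thousands of concurrent workers on HPC resources without the author managing queues, data movement, or fault tolerance.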
ensemble simulations, molecular dynamics simulations, adaptive workflows
70
2 - Tuesday, 5-7 pmKerkFKee
Texas Tech University
OAC Core: Small: Collaborative Research: Conversational Agents for Supporting Sustainable Implementation and Systemic Diffusion of Cyberinfrastructure and Science Gateways
OAC-2042054 and OAC-2042055
The advent of machine learning (ML) has led to the widespread adoption of task-oriented dialog systems for scientific applications (e.g., science gateways), where voluminous information sources are retrieved and curated based on domain-user intents. Yet there remains a challenge in designing chatbot dialog systems whose utility can be measured and that enable widespread diffusion among scientific communities. In this work, we develop the Vidura Advisor Framework (VAF), which supports the design of ML-based dialog systems for information retrieval (IR) tasks and quantifies their utility based on human performance in various science gateway applications. We address the socio-technical challenge of designing dialog systems by using domain-expert feedback to apply our framework to multiple dialog system architectures, including a novel document ranking approach using the monoT5 architecture. VAF also features a utility measurement framework that assesses human performance based on a set of application utility metrics. We perform a three-fold experiment to demonstrate the utility of dialog systems integrated with science gateways based on IR performance, application utility, and perceived adoption on an exemplar science gateway, viz. KnowCOVID-19. Experimental results against state-of-the-art neural IR methods on benchmark datasets demonstrate the effectiveness of our approach. Furthermore, we observe that measuring application utility and human performance translates into increased perceived adoption among scientific communities.
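As background on the ranking step: the standard monoT5 recipe scores a query-document pair by the probability the model assigns to the token "true" after a fixed prompt. The sketch below follows the publicly documented monoT5 usage (checkpoint name and details are drawn from that literature, and are assumptions rather than VAF's exact implementation):

    import torch
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    # Public monoT5 checkpoint from the IR literature (assumed for illustration).
    tok = T5Tokenizer.from_pretrained("castorini/monot5-base-msmarco")
    model = T5ForConditionalGeneration.from_pretrained("castorini/monot5-base-msmarco")

    def relevance(query: str, doc: str) -> float:
        """Score a query-document pair as P("true") vs. P("false")."""
        ids = tok(f"Query: {query} Document: {doc} Relevant:",
                  return_tensors="pt", truncation=True).input_ids
        out = model.generate(ids, max_new_tokens=1,
                             output_scores=True, return_dict_in_generate=True)
        logits = out.scores[0][0]                       # vocabulary logits
        true_id, false_id = tok.encode("true")[0], tok.encode("false")[0]
        return torch.softmax(logits[[true_id, false_id]], dim=0)[0].item()

    # Rerank candidate passages for one user query.
    docs = ["...candidate passage A...", "...candidate passage B..."]
    ranked = sorted(docs, key=lambda d: relevance("covid vaccine efficacy", d),
                    reverse=True)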
Science gateways, chatbot dialog system, user technology adoption
71
3 - Wednesday, 1030-1130WolfgangEKerzendorf
TARDIS RT
Elements: The TARDIS radiative transfer framework - A modeling toolkit for transients
2311323
TARDIS is an open-source Monte Carlo radiative-transfer spectral synthesis code for 1D models of supernova ejecta. It is designed for rapid spectral modelling of astrophysical transients. It is developed and maintained by a multi-disciplinary team including software engineers, computer scientists, statisticians, and astrophysicists.
Open Science, Physics, Python
72
2 - Tuesday, 5-7 pmMaratKhairoutdinov
Institute for Advanced Computational Science, Stony Brook University
Collaborative Research: Towards Better Understanding of the Climate System Using a Global Storm-Resolving Model
AGS-2218827
Title: Global System for Atmospheric Modeling: Spanning Urban to Global Scales. The System for Atmospheric Modeling (SAM) has been a staple of the cloud-resolving modeling community. Recent advancements have yielded a global storm-resolving model (GSRM) termed global SAM (gSAM). Notably, gSAM not only operates on global domains at resolutions of a few kilometers but also excels in local regional domains with open boundary conditions. A significant part of its application is its ability to function as a large-eddy simulation model and an urban-climate model. Recently, gSAM has been enhanced to simulate atmospheric flow around buildings, making it especially attuned to urban settings. Further enhancing its urban modeling capabilities, gSAM's atmospheric radiative transfer has been made building-aware: it accounts for building-induced factors such as shadowing, reflection, and sidewall emissions, effectively capturing shortwave and longwave surface fluxes. This means that gSAM stands out in its capability to simulate atmospheric motion across all scales, from intricate urban scenarios demanding resolutions of mere meters to global storm-resolving simulations requiring kilometer-scale resolutions. Despite these multifaceted enhancements, gSAM continues to exhibit high computational efficiency on massively parallel computing systems. As part of our ongoing commitment to community engagement, we plan to make gSAM openly accessible, targeting both the cloud modeling and Earth system research communities. By housing gSAM on GitHub, we aim to foster an environment of collaborative improvement.
Large-eddy simulation, global storm-resolving model, urban modeling, massively parallel model
73
2 - Tuesday, 5-7 pmMohammadKhalid Jawed
University of California, Los Angeles
Collaborative Research: Elements: Discrete Simulation of Flexible Structures and Soft Robots
2209782, 2209784, 2209783
The project's objective is the implementation and deployment of a simulation framework for flexible structures and soft robots. Faster-than-real-time (but physically accurate) simulation of complex systems holds the key to a number of unsolved challenges across a broad spectrum of disciplines, ranging from robotics to micro/nano engineering. Such systems are typically composed of slender elements (e.g., rods and shells) and undergo geometrically nonlinear deformation even under moderate loading. Moreover, they often experience friction, hydrodynamics, and contact. As a result, high-fidelity and fast simulation of such systems is challenging to achieve. Interestingly, the computer graphics community faced similar challenges in animating hair and clothes in movies and video games. This led to the growth of a new field, Discrete Differential Geometry (DDG), that helped significantly reduce computation time. The proposed project will lead to a general-purpose simulation tool for a wide variety of structures that can be used by users with no knowledge of DDG.
Simulation, robotics, computer graphics
74
2 - Tuesday, 5-7 pmLatifurKhan
University of Texas at Dallas
Frameworks: Infrastructure For Political And Social Event Data using Machine Learning
2311142
This project intends to revolutionize computerized extraction of conflict event data at a global scale. Currently, most conflict event data are expensively coded by humans from news reports. This project uses recent advances in artificial intelligence and large language models to address this fundamental issue for conflict research. It builds on earlier NSF efforts that created a publicly available large language model to study inter- and intra-state conflict, called ConfliBERT. This project expands the ConfliBERT model to multilingual settings, including Arabic and Spanish. In addition to the multilingual models, the project will create network data for individuals, groups, locations, and events. As the project's cyberinfrastructure develops, we will foster a research community through training, education, and outreach with groups at local, national, and international levels, including academics and government. This will help researchers and policymakers better understand conflict in foreign locations with high accuracy and in real time.
Information Extraction, Transformer Language Model, Event Coding
75
2 - Tuesday, 5-7 pmLawrenceEKidder
Cornell University
Collaborative Research: Elements: A task-based code for multiphysics problems in astrophysics at exascale
OAC-2209655
We are developing an open-source community code for multi-scale, multi-physics problems in astrophysics and gravitational physics. The code uses transformative algorithms to reach exascale. The techniques can be applied across discipline boundaries in fluid dynamics, geoscience, plasma physics, nuclear physics and engineering. The development of this new code has been driven by the deployment of gravitational wave detectors such as LIGO. The code is designed to scale to over a million cores using two key algorithmic innovations: task-based parallelism and a hybrid discontinuous Galerkin - finite difference subcell method. These innovations promise revolutionary impact in other fields relying on numerical solution of partial differential equations at exascale.
astrophysics PDEs
76
3 - Wednesday, 1030-1130HyesoonKim
Georgia Tech
Elements:Open-source hardware and software evaluation system for UAV
2103951
This project introduces an innovative approach to multi-drone flight scenarios within a Cyber-Physical Systems (CPS) simulation infrastructure. The framework includes a custom-designed reconfigurable UAV hardware and software stack, enabling the evaluation of diverse computing and communication accelerations. It offers comprehensive simulation capabilities, incorporating model-based and learning-based algorithms, coupled with pre-hardware prototyping validation on various drone platforms. This year's achievements encompass the demonstration of multi-task federated reinforcement learning and the integration of FPGA-based localization. The project aims to contribute to the research community by developing an open-source framework, UniUAVSim, that directly links flight-controller hardware to a physics simulation. The extended version of the framework is accessible through Docker, with all dependencies and toolchains captured in the Dockerfile. Through these endeavors, the project advances multi-drone flight simulation and Cyber-Physical Systems research.
Drone simulation, FPGA, CPS
77
2 - Tuesday, 5-7 pmMelissaKline Struhl
Massachusetts Institute of Technology
Cyberinfrastructure for Remote Data Collection with Children
2209756
Since 2019, the NSF-funded Lookit platform has served developmental psychologists as a platform for remote, unmoderated (on-demand) studies that securely captures webcam video and manages family-researcher communication. At the start of the COVID pandemic, the broader infant/child research community responded to the shutdown of all in-person labs with Children Helping Science, a "bulletin board" site hosting advertisements for remote studies, including video-conference sessions (e.g., Zoom) and surveys (e.g., Qualtrics, REDCap). In 2023, these platforms merged, running on the existing Lookit architecture and advertised to participating families as Children Helping Science. The expanded infrastructure now accommodates external (video-conference or unmoderated/self-paced) studies, allowing families to access all types of remote studies from a single portal. Researchers benefit from targeted recruitment, secure access to demographic data, and data integration tools across a range of study infrastructures. We explore the merger's successes and challenges from technical, regulatory, and community perspectives.
Babies, children, secure data, community management, data management
78
1 - Tuesday, 10-11 amAndreasKloeckner
University of Illinois
SHF: Small: Collaborative Research: Transform-to-perform: languages, algorithms, and solvers for nonlocal operators
SHF-1911019, SHF-1909176
Non-local operators such as layer potentials have long played an important role in mathematical modeling across many disciplines. New work on fractional derivatives, which are also non-local, has rapidly expanded the number of operators receiving attention. Modeling efforts increasingly include such non-local operators interacting with classical partial differential operators. Fast multipole methods (FMM) are known to efficiently evaluate layer potentials, but have not been demonstrated for many other classes of non-local operators, including fractional derivatives. In addition, robust software systems for these currently lag far behind the relatively mature libraries and domain-specific languages for finite element discretizations of PDEs. This project extends UFL, a well-known language for finite element discretizations of PDEs, to include a robust range of non-local operators. An extended compiler will generate performant code combining FMM with effective finite element algorithms.
nonlocal operator, fast multipole method, fast algorithm, numerical modeling, DSL
79
2 - Tuesday, 5-7 pmAndreasKloeckner
University of Illinois
Elements: Transformation-Based High-Performance Computing in Dynamic Languages
OAC-1931577.
Multidimensional arrays are a foundational data structure for much of scientific computing, with applications ranging from weather prediction to deep learning, from image processing to computational neuroscience. Even the efficient execution of matrix-matrix multiplication poses considerable technical challenges. Through a polyhedral program transformation tool, we provide a separation between mathematical intent and the technical challenges of program optimization, allowing each task to be performed by a domain expert. In this project, we are developing means for more efficient on-chip communication, code generation for prefix sums, reuse and abstraction in program transformation, increased ease of use in transformation discovery and performance analysis, and expressing array computations in user programs. We validate the proposed techniques through a challenging application from numerical analysis with broad applicability.
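As a small example of this separation of concerns, the loopy package (a polyhedral code-transformation tool from the same research group; the snippet is illustrative, not a project deliverable) lets the mathematical statement stay untouched while optimizations are layered on afterwards:

    import loopy as lp

    # Mathematical intent only: an elementwise array operation on a 1-D domain.
    knl = lp.make_kernel(
        "{ [i]: 0 <= i < n }",
        "out[i] = 2 * a[i]")

    # Performance concerns applied separately: split the loop and map the
    # pieces onto GPU work-groups ("g.0") and work-items ("l.0").
    knl = lp.split_iname(knl, "i", 128, outer_tag="g.0", inner_tag="l.0")

    print(lp.generate_code_v2(knl).device_code())  # emits OpenCL C

A performance expert can swap in different split sizes or tags without touching the first, mathematical half of the program, which is the division of labor the abstract describes.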
tensor, program transformation, polyhedral, array, HPC
80
3 - Wednesday, 1030-1130RichardJKnepper
Cornell University
CSSI: Frameworks: Large Scale Atmospheric Research Using an Integrated WRF Modeling, Visualization, and Verification Framework (I-WRF)
OAC-2209711
In this poster, Cornell and NCAR introduce I-WRF, a CSSI framework implementation that will support the development and deployment of an application container suite integrating the Weather Research and Forecasting Model (WRF), the Model Evaluation Tools (MET), and METplus. I-WRF will lower the barrier for multi-disciplinary researchers who wish to run WRF in parallel on platforms ranging from desktops to clouds to supercomputers. Users will not have to configure and deploy individual elements separately: containers will include the entire environment, with recipes to facilitate wider use. Use-case scientists will apply I-WRF to scale research studies on the impact of climate change on wind and solar power generation and urban air quality. In addition, the ability to run multi-node simulations on desktops will overcome current obstacles in training students to use WRF at NCAR and in university course curricula.
Applied computing, Earth and atmospheric sciences, Computing methodologies, Modeling and simulation, Information systems, Computing platform
81
3 - Wednesday, 1030-1130ChristopherJ.Knight
The University of Chicago
Frameworks: Data-Driven Software Infrastructure for Next-Generation Molecular Simulations
2311260
This research program is dedicated to creating a state-of-the-art software infrastructure for advanced computer simulations. Utilizing data-driven many-body potentials, it is optimized for high-performance computing (HPC). The enhancement of MB-Fit software and the introduction of a GPU-accelerated MBX version aim to revolutionize molecular simulations. In collaboration with existing molecular dynamics tools, the program addresses previously insurmountable challenges in molecular sciences. This initiative, at the nexus of software engineering and molecular sciences, champions next-generation simulations in various fields, from chemistry to biophysics. Its adaptability to emerging HPC trends promises unparalleled realism in computational studies, unveiling critical structure-property relationships and catalyzing innovations in materials, energy, and biomaterials. The program's modular design encourages community collaboration. Additionally, it emphasizes training through workshops, hands-on sessions, and prioritizes molecular science education for undergraduates, especially those from underrepresented backgrounds, preparing them for tech-driven careers.
molecular simulations, high-performance computing, data-driven algorithms, many-body interactions, open-source software
82
3 - Wednesday, 1030-1130MichellePKuchera
Davidson College
Elements: Portable Machine Learning Models for Experimental Nuclear Physics
2311263
Our project focuses on providing pre-trained machine learning (ML) models to the experimental nuclear physics community. When one surveys the myriad ways in which ML is used in the community, there is a tendency to construct bespoke models from scratch for each new application. This requires considerable technical expertise and access to large volumes of human-annotated data. Some recent efforts in ML have focused on minimizing the reliance on large-scale labeling of data. This project centers on building models using these techniques for various nuclear physics experiments at FRIB that will then be released to the scientific community. These models can be adapted through a process of offline tuning by end-users for a variety of downstream applications. The models are developed using unlabeled data from three particle detector systems. The models are evaluated on key analysis and fitting tasks that have been identified by our collaborators at FRIB.
machine learning, nuclear physics
83
2 - Tuesday, 5-7 pmKrishnaKumar
UT Austin
Elements: Cognitasium - Enabling Data-Driven Discoveries in Natural Hazards Engineering
2103937
Identifying appropriate data sources is critical for data-driven discovery. DesignSafe's index search often fails to provide the necessary context; for example, searching for specific soil testing datasets showing liquefaction fails to yield results. We develop a reasoning engine using Large Language Models (LLMs) to enable semantic search and graph prediction on 200 soil testing datasets. However, constructing a semantic search engine requires domain experts to identify appropriate metadata, and CI skills to develop metadata scrapers and knowledge graph (KG) engines. Challenges include missing headers, inconsistent data models, mixed data in Excel, and inconsistent variable names. Developing an ETL framework through pure coding cannot scale to petabytes of data. Our hybrid human-LLM approach to metadata extraction overcomes zero-shot LLM issues such as hallucinations and inconsistencies. We demonstrate LLMs' capability in constructing KGs to open new frontiers in data-driven discovery.
LLM, Knowledge Graph, DesignSafe, Data-driven discoveries
84
3 - Wednesday, 1030-1130RatneshKumar
Iowa State University
Elements: Agricultural Cyber-infrastructure support for Field and Grid Modeling, and Runtime Decision-Making
2004766
The "MISSION: Model-based In-Season Sensor-driven Scheduling of Irrigation Or/and Nutrients" framework is presented here to maximize farm profit and reduce environmental impact by efficiently simulating a field-calibrated agriculture model considering (i) current plant and soil conditions, (ii) the history of weather and agriculture inputs, and (iii) forecasted weather from the Global Forecast System (GFS). Using data from the USDA LIRF experimental farm in Greeley, Colorado, for model calibration and decision-making, the framework's real-time in-season recommendations generated a profit of $1,926/ha, 41% more than traditional expert-knowledge-based applications, and merely 5% less than the scenario where the weather was taken to be known a priori from the historical record. We also report the first cloud hosting of the popular agriculture model RZWQM, on the MyGeoHub cyberinfrastructure, where we also built a cloud-based tool for model calibration and the proposed decision-making framework. These cloud resources help accelerate the optimization process with a simplified workflow for end users.
Agriculture Modeling RZWQM, Precision Agriculture, Model-predictive decision-making, Cloud-Hosting
85
3 - Wednesday, 1030-1130PabloLaguna
The University of Texas at Austin
Collaborative Research: Frameworks: The Einstein Toolkit Ecosystem: Enabling fundamental research in the era of multi-messenger astrophysics
2114582, 2004157, 2004044, 2004311, 2004879, 2003893
We present results and the status of the Einstein Toolkit: an open-source, community-driven cyberinfrastructure ecosystem that provides key computational tools to transform and support ground-breaking research in computational astrophysics, gravitational physics, and fundamental science.
Numerical Relativity, computational astrophysics
86
3 - Wednesday, 1030-1130DavidLange
Princeton University
Elements: C++ as a service - rapid software development and dynamic interoperability with Python and beyond
1931408
A key enabler of innovation and discovery for many scientific researchers is the ability to explore data and express ideas quickly as software prototypes. Tools and techniques that reduce the "time to insight" are essential to the productivity of researchers. Conversely, performance-focused languages, such as C++, are a critical infrastructure component for many scientific fields that have either large computing challenges or the need for low-latency results. The productivity of data scientists is increased by an easy-to-use dynamic programming and development environment, together with a fully featured interoperability layer. The CaaS project provides a dynamic C++ execution environment and enables runtime language interoperability between C++ and other languages, such as Python, through a native-like, dynamic environment. We will demonstrate project outcomes, including the use of this environment to enable dual-language notebooks.
llvm, interoperability, interactivity
87
2 - Tuesday, 5-7 pmJulienLangou
University of Colorado Denver
Collaborative Research: Frameworks: Basic ALgebra LIbraries for Sustainable Technology with Interdisciplinary Collaboration (BALLISTIC)
2004850
The Linear Algebra PACKage (LAPACK) is a community standard for dense linear algebra and has been adopted and supported by a large community of users, computing centers, and high-performance computing (HPC) vendors. The <T>LAPACK library is one proposal to close the gap between LAPACK and new and emerging computing platforms. It uses C++ templates to provide precision-neutral algorithms, i.e., algorithms that work in single, double, half, and multiprecision types and, in some cases, allow for mixed precision. The notion of a matrix is also abstracted, which enables interoperability with other existing frameworks (e.g., Eigen). In the long term, we hope that some of the framework of <T>LAPACK can be used to develop new implementations of state-of-the-art numerical linear algebra libraries. In this poster, we present <T>LAPACK: its design, features, and some examples of usage. We also present: (1) new progress in our efforts to propagate and test for consistent exception handling in LAPACK, (2) new features recently integrated into LAPACK, and (3) recent featured research results.
Software, Numerical Linear Algebra, C++, community, sustainability, open source, LAPACK, BLAS
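<T>LAPACK itself is C++, but the precision-dispatch idea can be illustrated against the classic LAPACK bindings in SciPy, where the routine variant (sgesv/dgesv/cgesv/zgesv) is selected from the input types; this is an analogy we supply, not <T>LAPACK's API:

```python
import numpy as np
from scipy.linalg import get_lapack_funcs

def solve(a, b):
    # Select the LAPACK driver matching the inputs' precision/type.
    gesv, = get_lapack_funcs(("gesv",), (a, b))
    lu, piv, x, info = gesv(a, b)
    if info != 0:
        raise np.linalg.LinAlgError(f"gesv failed with info={info}")
    return x

a = np.array([[3.0, 1.0], [1.0, 2.0]], dtype=np.float32)
b = np.array([9.0, 8.0], dtype=np.float32)
print(solve(a, b).dtype)  # float32: dispatched to sgesv
```

<T>LAPACK achieves the same effect at compile time with a single C++ template per algorithm instead of four hand-maintained precision variants.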
88
1 - Tuesday, 10-11 amChristopherLeague
Long Island University
Bifrost: A CPU/GPU Pipeline Framework for High Throughput Data Acquisition and Analysis
2103771
Bifrost is a CPU/GPU framework for developing high-throughput data acquisition and analysis systems. This project aims to strengthen its foundations, improve its usability, and broaden its applicability. We have increased the data rates that Bifrost is capable of handling and simplified the process for configuring and installing the software. We streamlined its dependencies, improved its interoperability with other libraries, and updated a series of tutorials and documentation. As for applications, we have continued to develop Bifrost-based pipelines in radioastronomy, including beam-former and imaging components for the Caltech OVRO-LWA station. Bifrost will also be critical in the operation of the new LWA station at the North Arm of the VLA. We have also made substantial progress in employing Bifrost for interferometric synthetic-aperture radar (InSAR), to study ground subsidence. Converting one research pipeline to use GPU computation via Bifrost reduced its runtime from 92 hours to 17 minutes.
GPU, Python, radioastronomy, data acquisition, interferometry, synthetic aperture radar, big data analysis
89
2 - Tuesday, 5-7 pmJonghyunLee
University of Hawaii at Manoa
Elements: ALE-AMR Framework and the PISALE Codebase
2005259
The solution of partial differential equations (PDEs) on modern HPC platforms is essential to the continued success of research and modeling in a wide variety of areas, especially groundwater flow and transport modeling in Pacific islands. The project implements an innovative combination of Arbitrary Lagrangian-Eulerian (ALE) methods with Adaptive Mesh Refinement (AMR), including parallel software tools that dynamically adapt the grids and special Lagrangian-flow methods that allow simulation of complex regional groundwater flow with dynamic freshwater-seawater interaction in heterogeneous volcanic rocks. The PISALE (Pacific Island Structured-AMR with ALE) project aims at a publicly available, sustainable branch of the software and will provide accurate and scalable simulations of complex groundwater flow processes in the Hawaiian islands. We will present the status of the ongoing project, including student involvement, course development, and future plans, and discuss other applications of PISALE.
PISALE, Arbitrary Lagrangian Eulerian, Adaptive Mesh Refinement, Density-driven flow modeling, Coastal Aquifer, Groundwater
90
2 - Tuesday, 5-7 pmJasonLeigh
University of Hawaii
Collaborative Research: CSSI Frameworks: SAGE3: Smart Amplified Group Environment for Harnessing the Data Revolution
2004014 | 2003800 | 2003387
The Scalable Adaptive Graphics Environment (SAGE) and the Scalable Amplified Group Environment (SAGE2) enable scientists, researchers, and students to collaborate with their colleagues and their data in front of scalable tiled display walls, sharing information and digital media, particularly large-scale visualizations and animations, to make discoveries and reach decisions with greater speed, accuracy, comprehensiveness, and confidence. This approach has supported collaborations for over 5000 users across 600 institutions equipped with tiled display walls, spanning 21 disciplines. SAGE3 (sage3.sagecommons.org) is a comprehensive redesign and re-implementation of its predecessors that provides two fundamental capabilities: (1) the seamless exchange of large varieties and quantities of content across both expansive display walls and ordinary devices such as laptops and desktops; and (2) the augmentation of data-intensive collaboration with artificial intelligence (AI).
visualization, collaboration, artificial intelligence
91
3 - Wednesday, 1030-1130SanjivaK.Lele
Stanford University
Elements: AMR-H: Adaptive Multi-resolution High-order Solver for Multiphase Compressible Flows on Heterogenous Platforms
OAC-2103509
The high-order (compact-scheme) finite-difference simulation framework was extended to applications involving shock/turbulent-boundary-layer interactions on airfoils and turbulent flows of non-ideal gases near their critical conditions. All research outcomes are shared via technical papers and conference presentations. A Legion-based AMR framework is under development, with a focus on an architecture design that allows efficient future optimization of communication and workload balance. Encouraged by the warm response to our success with the compact-scheme flow solver on Summit during year 1 of the project, we have steered our efforts toward an open release of a large-scale parallel linear solver for compact numerical schemes. This extended work, in collaboration with LLNL and NVIDIA in the open-science domain, aims to design a highly optimized algorithm and open-source implementation for solving the cyclic penta-diagonal linear systems arising from high-order compact numerical schemes on multiple GPUs (see the note below). The software and its documentation will be released on GitHub.
CFD, Turbulence simulations, AMR, Exascale
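For context on where the cyclic penta-diagonal system comes from: a representative pentadiagonal compact approximation of the first derivative (the standard family associated with Lele, 1992, shown here as a generic example rather than the project's exact scheme) reads

```latex
\beta f'_{i-2} + \alpha f'_{i-1} + f'_{i} + \alpha f'_{i+1} + \beta f'_{i+2}
  = a\,\frac{f_{i+1}-f_{i-1}}{2h}
  + b\,\frac{f_{i+2}-f_{i-2}}{4h}
  + c\,\frac{f_{i+3}-f_{i-3}}{6h}
```

On a periodic grid the unknown derivatives $f'_i$ couple in a ring, so each derivative evaluation requires solving a cyclic penta-diagonal linear system; that solve is the kernel being optimized for multi-GPU execution.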
92
1 - Tuesday, 10-11 amGerardLemson
Johns Hopkins University
Sustainability: Open SciServer: A Sustainable Data-Driven Science Platform
2311791
The Open SciServer project implements a transition to sustainability for the SciServer collaborative data-driven science platform. SciServer is a high-impact, highly successful DIBB (data infrastructure building block) with a well-developed code base, an established user community, and demonstrated impact on scientific discovery, research, and education. SciServer has grown from a platform with transformational impact in astronomy to one that creates significant impact in multiple science domains including, but not limited to, materials science and engineering, turbulence, oceanography, and precision medicine and genomics. To facilitate the transition to sustainability, we will work with the NSF Science Gateways Community Institute and deepen our interaction with the community. The transition plan will migrate the SciServer code base from closed to open source, enabling the user community to expand and to drive continued evolution of the platform as technologies and science drivers grow. The second major thrust of the transition plan is the creation of simple, rapid cloud deployment and management options to suit the broad range of current and future SciServer users. The transition will also generalize SciServer's educational tools so that all users can integrate SciServer into curricular efforts, accelerating the translation of education into research and workforce development in data-intensive science domains.
none
93
2 - Tuesday, 5-7 pmYongsheng Leng
George Washington University
CDS&E: Computational Simulation and Cyber Software Development for Nanoscale Friction
1953171
Friction exists at the contact between two sliding surfaces. It occurs throughout nature, from objects the width of a single hair (1000 nanometers) to the continental scale (earthquakes) and even to particles in outer space. This project developed a new computational modeling and simulation tool that enables a deeper understanding of nanoscale friction dynamics, supporting more efficient advanced manufacturing processes and high-performance nanoscale machines and systems, from drug delivery devices to nanorobots. The project will train graduate and undergraduate students in algorithm and code development and will broaden the participation of underrepresented groups by developing their research skills for careers in computational and data-enabled science and engineering.
Atomic scale friction, molecular dynamics, open source CI
94
2 - Tuesday, 5-7 pmAlexanderLex
University of Utah
Collaborative Research: Framework: Software: HDR: Reproducible Visual Analysis of Multivariate Networks with MultiNet
1835904
MultiNet is an open-source network visualization platform that empowers users to seamlessly upload data, generate interactive network visualizations, and explore them collaboratively. The platform provides two general network visualizations: a dynamic node-link diagram (MultiLink) and an interactive adjacency matrix (MultiMatrix). Additionally, MultiNet includes the UpSet visualization, a popular method for visualizing set (or hypergraph) data. The versatility of MultiNet transcends disciplinary boundaries: we have worked with collaborators in diverse domains, including social network analysis, retinal connectomics, and geology. We have also created bespoke applications that address specific needs in evolutionary biology, using MultiNet as the data management platform. MultiNet stands as a dynamic and versatile platform reshaping data visualization, fostering collaboration, and driving insights across a wide spectrum of scientific domains.
Networks visualization, Visualization, Software, Infrastructure, Data Management
95
1 - Tuesday, 10-11 amDongLi
UC Merced
Collaborative Research: Elements: SciMem: Enabling High Performance Multi-Scale Simulation on Big Memory Platforms
2104116
The project will enable high-performance multiscale simulations on big-memory platforms through more efficient utilization of large and heterogeneous memory machines. Specifically, it will replace computations with pre-computed data stored in memory on heterogeneous computing systems. The resulting tool, SciMem, will be integrated and tested with the popular parallel molecular dynamics (MD) simulator LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator). These improvements in the use of computational resources will allow more accurate models of complex physical phenomena to run on emerging hardware systems. SciMem aims to bring a 10x performance improvement to certain large-scale multiscale simulations widely used in computational chemistry and materials science, e.g., quantum mechanical/molecular mechanical MD simulation of catalysis. A conceptual sketch of the compute-for-memory trade appears below.
multiscale simulation, molecular dynamics, big memory, heterogeneous computing
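A minimal Python sketch of that compute-for-memory trade (our illustration, not SciMem's API; the kernel and grid are hypothetical):

```python
import numpy as np

def expensive_pair_energy(r):
    # Stand-in for a costly evaluation that a multiscale MD step would
    # otherwise repeat millions of times per timestep.
    return np.exp(-r) / r

# Pre-compute once on a fine grid and keep the table in (big) memory ...
R = np.linspace(0.1, 10.0, 1_000_000)
TABLE = expensive_pair_energy(R)

# ... then replace recomputation with a lookup plus interpolation.
def fast_pair_energy(r):
    return np.interp(r, R, TABLE)

print(fast_pair_energy(2.5))
```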
96
3 - Wednesday, 1030-1130XuLiang
University of Pittsburgh
Collaborative Research: Frameworks: Building a collaboration infrastructure: CyberWater2—A sustainable data/model integration framework
2209833, 2209835, 2209834
The CyberWater2 project builds on CyberWater and boosts it to a new level of capability and effectiveness for broad collaboration. CyberWater2 aims to (1) facilitate two-way couplings across heterogeneous computing platforms, disciplines, and organizations; (2) automate complex model calibration and facilitate data assimilation processes applicable to various models; (3) provide a web service framework; (4) enable sustainable online data access by automatically adapting data agents to changes made by data providers; and (5) support intelligent site recommendation for on-demand HPC/Cloud access. In the transition from CyberWater to CyberWater2, we have integrated the Docker platform to facilitate scientific modeling and collaboration across disciplines (demonstrated with WRF-Hydro), upgraded the adopted VisTrails stack (including Qt, VTK, etc.) from its original Python 2.7 to 3.9, and developed an asynchronous workflow mechanism and HPC site recommendation to maximize the capacity of on-demand HPC/Cloud access and optimize the performance/cost ratio.
Two-way coupling, heterogeneous computing platforms, Docker, asynchronous workflow control, HPC access on-demand, HPC site recommendation
97
2 - Tuesday, 5-7 pmLauraELindzey
University of Washington Applied Physics Laboratory
Making Ice Penetrating Radar More Accessible: A tool for finding, downloading, and visualizing georeferenced radargrams within the QGIS ecosystem
2209726
Ice-penetrating radar has been a fundamental tool for understanding polar ice sheets since the first flights in the 1960s. The US alone has spent many tens of millions of dollars on direct grants for the acquisition and analysis of radar data, and even more on related infrastructure and support. The QIceRadar project aims to make these critical datasets more FAIR: it will index data distributed across multiple international data centers (findability) and provide a consistent interface to the many formats in which the data are published (interoperability). We chose to build this functionality as a QGIS plugin because QGIS is already used by much of the polar community: Quantarctica and QGreenland have rapidly become indispensable data indices for researchers, making diverse datasets readily available. Thus, with no additional work on our part, QIceRadar will enable interpreting radargrams in context with many map-view datasets (see the sketch below).
ice penetrating radar; glaciology; QGIS
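As a rough illustration of why the QGIS ecosystem is a convenient home, loading a georeferenced raster takes a few lines of the standard QGIS Python API (generic QGIS usage with a made-up file path, not QIceRadar's own interface):

```python
# Run inside the QGIS Python console, where the qgis.core API is available.
from qgis.core import QgsProject, QgsRasterLayer

# Hypothetical georeferenced radargram exported as a GeoTIFF.
layer = QgsRasterLayer("/data/radargrams/flight_1234.tif", "flight_1234")
if layer.isValid():
    # The radargram now renders alongside Quantarctica/QGreenland layers.
    QgsProject.instance().addMapLayer(layer)
```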
98
1 - Tuesday, 10-11 amGuoyuLu
University of Georgia
Elements: A Deep Neural Network-based Drone (UAS) Sensing System for 3D Crop Structure Assessment
2104032
The large-scale in-situ 3D reconstruction of crop fields is a challenging task. 3D crop structures provide critical evidence for plant phenotyping and capture attributes that significantly affect crop growth and yield. Motivated by the application of UAVs in agriculture, we developed a learning-based unsupervised structure-from-motion (SfM) and visual odometry (VO) framework specifically to reconstruct the large-scale 3D structure of crop fields. We also introduce pose-graph and bundle-adjustment optimization into the network training process, which iteratively updates both the motion and depth estimates from the deep network and enforces the unsupervised photometric and geometric constraints on the refined outputs (a representative loss is sketched below). The algorithms built through this project enhance 3D reconstruction and motion estimation for a variety of scenes, such as crop fields and cities. The research outcomes have been disseminated to students and researchers in AI, robotics, and agriculture, and to farmers.
3D crop field reconstruction, AI infrastructure, UAV motion analysis
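For reference, unsupervised SfM/VO frameworks of this kind typically minimize a photometric reprojection loss of roughly the following standard form (assumed here as representative, not quoted from the project): with predicted depth $D_t$, relative pose $T_{t\to s}$, and camera intrinsics $K$, each pixel $p$ of the target frame $I_t$ is warped into the source frame $I_s$ and the appearance difference is penalized:

```latex
\mathcal{L}_{\mathrm{photo}}
  = \sum_{p} \bigl\lVert I_t(p) - I_s\!\bigl(\pi(K,\, T_{t\to s},\, D_t(p),\, p)\bigr) \bigr\rVert_1
```

where $\pi$ back-projects $p$ to 3D using $D_t(p)$, transforms it by $T_{t\to s}$, and projects it into the source view; the geometric constraints play an analogous role for depth and pose consistency.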
99
3 - Wednesday, 1030-1130Yung-HsiangLu
Purdue University
CDSE: Collaborative: Cyber Infrastructure to Enable Computer Vision Applications at the Edge Using Automated Contextual Analysis
2104709
This project aims to improve the efficiency of computer vision methods so that they can run on battery-powered edge devices. The method uses the earlier layers of a convolutional neural network to determine which pixels are useful; irrelevant pixels are removed from the remaining layers (a toy illustration appears below). This method requires no training and incurs no loss of accuracy. The project also hosts the cyberinfrastructure for the IEEE Low-Power Computer Vision Challenge. In 2023, 54 teams submitted more than 500 solutions. The cyberinfrastructure evaluates the solutions based on accuracy and execution time.
computer vision, edge computing, efficiency
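A toy sketch of the pixel-selection idea (our illustration, not the project's code): responses from an early stage are used to mask out uninformative pixels so that later stages process fewer locations:

```python
import numpy as np

def early_layer(x):
    # Stand-in for early convolutional layers: a simple local-contrast
    # response flags pixels that are likely to matter downstream.
    gx = np.abs(np.diff(x, axis=0, append=x[-1:, :]))
    gy = np.abs(np.diff(x, axis=1, append=x[:, -1:]))
    return gx + gy

def prune_pixels(x, keep_fraction=0.25):
    score = early_layer(x)
    cutoff = np.quantile(score, 1.0 - keep_fraction)
    mask = score >= cutoff      # keep only the highest-scoring pixels
    return x * mask, mask       # remaining layers can skip the zeroed pixels

img = np.random.rand(224, 224)
pruned, mask = prune_pixels(img)
print(f"kept {mask.mean():.0%} of pixels")
```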
100
1 - Tuesday, 10-11 amYung-HsiangLu
Purdue University
Collaborative Research: OAC Core: Advancing Low-Power Computer Vision at the Edge
2107020
Deep neural networks (DNNs) achieve state-of-the-art performance in many science and engineering domains. However, DNNs are expensive to develop, both in intellectual effort (e.g., devising new architectures) and computational cost (e.g., training). Re-using DNNs is a promising direction for amortizing costs within a company and across the computing industry, including national research laboratories. As with any new technology, however, there are many challenges in re-using DNNs. These challenges include both missing technical capabilities and missing engineering practices. This talk will describe the challenges in current approaches to DNN re-use. We summarize studies of re-use failures across the spectrum of re-use techniques, including conceptual re-use (e.g., building on a research paper), adaptation (e.g., building on an existing implementation), and deployment (e.g., direct re-use on a new device). We outline possible advances that would improve each kind of re-use. This work has clear implications for the broader CSSI community, which needs to understand how software engineering informs DNN development and deployment and vice versa, much as design patterns influenced the previous generation of traditional software development. Furthermore, continued investment at the intersection of software engineering and machine learning is crucial to ensuring public trust in AI/ML in scientific domains, since software engineering provides the key mechanisms for transparent and reproducible practice.
Machine learning, Deep learning, Pre-trained models, Re-use, Empirical software engineering, Position, Vision