Ian Foster
Argonne National Laboratory
The University of Chicago
foster@anl.gov
From Data to Discovery: Advancing AI at Scale with Cross-Facility Collaboration
Or: An Agentic Science Cloud for AI-enabled Discovery
The scientific method has transformed society
Scientific method
But we are falling behind
[Figure: log value vs. time. Labels: data rates, total data, computing, publications, complexity, researchers, funding. Annotations: “Fraction exploited without innovation”; “Data volumes, problem complexity”.]
Despite acceleration via automation
Scientific method
Managed transfer & sync
Reliable automation
Managed remote execution
Unified data access
Publication & discovery
Automation of “easy” tasks surfaces new bottlenecks
Synthesize knowledge and propose hypotheses
Write, debug, and run programs
Configure and run experiments
Interpret results to inform new hypotheses
Tasks performed by humans that emerge as bottlenecks
We need to delegate …
to AI-enabled agents that act on our behalf
Agents that:
We imagine a future with many agent assistants
“Agents for science” are increasingly popular
A computational system that can interact with its environment and learn from those interactions
Search database, invoke code, query LLM, …
Data repositories, HPC, robotic labs, other agents
Accumulate data, adapt processes, improve answers
How do we build and deploy these things?
But what is an “agent”?
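The definition above (interact with an environment, learn from the interactions) can be sketched in a few lines of Python. This is an illustrative toy only: ToyAgent, simulate, and the tool-registry layout are hypothetical names, not part of any framework named in this talk.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ToyAgent:
    """Hypothetical agent: acts through named tools and remembers outcomes."""

    tools: Dict[str, Callable[[float], float]]
    memory: List[Tuple[str, float, float]] = field(default_factory=list)

    def act(self, tool: str, x: float) -> float:
        result = self.tools[tool](x)            # interact with the environment
        self.memory.append((tool, x, result))   # accumulate data
        return result

    def best(self) -> Tuple[str, float, float]:
        # "Improve answers": report the input with the lowest observed result.
        return min(self.memory, key=lambda record: record[2])


def simulate(x: float) -> float:
    """Stand-in for invoking a simulation code; a simple quadratic."""
    return (x - 3.0) ** 2


agent = ToyAgent(tools={"simulate": simulate})
for x in [0.0, 2.0, 5.0]:
    agent.act("simulate", x)
best_tool, best_x, best_value = agent.best()
```

A real scientific agent would replace the tool table with database searches, HPC jobs, or LLM queries, and the memory with something more durable than a list.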
Agentic applications require agentic middleware
Agentic
middleware
Agentic applications
Experimental facilities
Data storage
Compute
An “integrated research infrastructure” for agentic applications
Agentic middleware challenges in science
Not addressed by LangChain, AutoGen, OpenAI Agents, Claude Agents, etc.!
Agentic middleware challenges in scientific computing
Areas we focus on initially …
Under review in IEEE Computer
Dr. Greg Pauloski
Dr. Kyle Chard
Globus is a not-for-profit service operated for the research community by the University of Chicago, supported by ~250 subscribing institutions
We assume the Globus hybrid cloud fabric that allows us to authenticate, delegate, start & control programs, manage multi-step flows … anywhere
Exploring agentic middleware: Academy
Client
Handle
Handle
Agent
Control
Actions
State
Handles
Exchange (Data Plane)
Mailbox
Mailbox
Mailbox
Launcher(s) (Control Plane)
Control
Actions
Agent
State
Handles
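The roles in this diagram (client, handle, mailbox-based exchange, launcher, agent) can be illustrated with a single-process toy built only on the Python standard library. This is a sketch under stated assumptions, not Academy's implementation: ToyExchange, ToyHandle, run_agent, and launch are hypothetical names, and passing a Future through the mailbox only works because everything runs in one process.

```python
import queue
import threading
from concurrent.futures import Future
from typing import Any, Callable, Dict


class ToyExchange:
    """Data plane: one mailbox (queue) per registered agent ID."""

    def __init__(self) -> None:
        self._mailboxes: Dict[str, queue.Queue] = {}

    def register(self, agent_id: str) -> None:
        self._mailboxes[agent_id] = queue.Queue()

    def send(self, agent_id: str, message: Any) -> None:
        self._mailboxes[agent_id].put(message)

    def recv(self, agent_id: str) -> Any:
        return self._mailboxes[agent_id].get()


class ToyHandle:
    """Client-side reference to an agent; action requests return futures."""

    def __init__(self, exchange: ToyExchange, agent_id: str) -> None:
        self._exchange = exchange
        self.agent_id = agent_id

    def request(self, action: str, *args: Any) -> Future:
        future: Future = Future()
        self._exchange.send(self.agent_id, (action, args, future))
        return future

    def shutdown(self) -> None:
        self._exchange.send(self.agent_id, None)  # None is the stop signal


def run_agent(exchange: ToyExchange, agent_id: str,
              actions: Dict[str, Callable[..., Any]]) -> None:
    """Agent loop: read the mailbox, run the action, resolve the future."""
    while True:
        message = exchange.recv(agent_id)
        if message is None:
            break
        action, args, future = message
        try:
            future.set_result(actions[action](*args))
        except Exception as exc:  # report failures to the caller
            future.set_exception(exc)


def launch(exchange: ToyExchange, agent_id: str,
           actions: Dict[str, Callable[..., Any]]) -> ToyHandle:
    """Control plane: start the agent in a thread and return a handle."""
    exchange.register(agent_id)
    threading.Thread(target=run_agent, args=(exchange, agent_id, actions),
                     daemon=True).start()
    return ToyHandle(exchange, agent_id)


exchange = ToyExchange()
handle = launch(exchange, "squarer", {"square": lambda v: v ** 2})
result = handle.request("square", 3).result(timeout=5)
handle.shutdown()
```

Because the handle only knows an agent ID and an exchange, it can be handed to another agent, which is the property the diagram's "Handles" boxes suggest.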
Academy middleware prototype: Agent definition
import threading
import time

from academy.behavior import Behavior, action, loop

class Example(Behavior):
    def __init__(self) -> None:
        self.count = 0  # State stored as attributes

    @action
    def square(self, value: float) -> float:
        return value**2

    # Named "increment" so the method does not clash with the
    # "count" attribute set in __init__.
    @loop
    def increment(self, shutdown: threading.Event) -> None:
        while not shutdown.is_set():
            self.count += 1
            time.sleep(1)
Agents defined by a behavior
(e.g., service, embodied, AI)
Clients & other agents can request actions
An instance of a behavior holds the agent's state
Control loops for autonomous behavior
Academy middleware prototype: Client usage
from academy.exchange.thread import ThreadExchange
from academy.launcher.thread import ThreadLauncher
from academy.manager import Manager

with Manager(
    exchange=ThreadExchange(),  # Can be swapped
    launcher=ThreadLauncher(),
) as manager:
    behavior = Example()  # From the prior slide
    handle = manager.launch(behavior)

    future = handle.square(2)
    assert future.result() == 4

    handle.shutdown()  # Or via the manager
    manager.shutdown(handle.agent_id, blocking=True)
Single interface for managing your agents
Choose exchange & launcher for environment
Interact with agents via handles
Pass handles to other agents
Academy use case: MOF discovery
Metal Organic Frameworks (MOFs):
Federated Agents
Intractable search space of ligand, node, & geometry combinations
How to discover MOFs with desirable properties for target applications?
Hypothesize
Publish
Experiment
Study
Set Goals
Simulate
Humans set research goals
Humans research related work
Humans create hypotheses to test
Develop
Humans write code and protocols
Humans run codes, process results
Humans synthesize, test MOFs in lab
Humans publish results
MOF Discovery Cycle
Human-Driven
MOF discovery pipeline
Hypothesize
Publish
Experiment
Study
Set Goals
Simulate
Humans set research goals
Humans research related work
Humans create hypotheses to test
Develop
Humans write code and protocols
Agents run codes, process results
Humans synthesize, test MOFs in lab
Humans publish results
MOF Discovery Cycle
Generate
Assemble
Validate
Optimize
Estimate
AI generated ligands
Assembled candidate MOFs
Structurally stable MOFs
Goal-optimized MOFs
Assessed MOFs
Database
Periodic model retraining
MOFA Workflow
Human-Driven
Automated
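The automated part of the cycle (Generate, Assemble, Validate, Optimize, Estimate, store in the database, retrain periodically) can be sketched as a plain Python loop. Every function below is a hypothetical placeholder that only manipulates strings; none of it is the MOFA code or real chemistry, and the "keep every other candidate" stability check and length-based score are arbitrary stand-ins.

```python
from typing import Dict, List, Tuple


def generate(n: int, model_version: int, start: int) -> List[str]:
    """Stand-in for the generative model that proposes ligands."""
    return [f"ligand-{start + i}-v{model_version}" for i in range(n)]


def assemble(ligands: List[str]) -> List[str]:
    """Stand-in for combining ligands and nodes into candidate MOFs."""
    return [f"MOF({ligand})" for ligand in ligands]


def validate(mofs: List[str]) -> List[str]:
    """Placeholder stability check: keep every other candidate."""
    return [mof for i, mof in enumerate(mofs) if i % 2 == 0]


def optimize(mofs: List[str]) -> List[str]:
    """Placeholder for structure optimization: mark each MOF as optimized."""
    return [mof + "*" for mof in mofs]


def estimate(mofs: List[str]) -> Dict[str, float]:
    """Placeholder score (string length), not a real capacity estimate."""
    return {mof: float(len(mof)) for mof in mofs}


def run_rounds(rounds: int, batch: int,
               retrain_every: int) -> Tuple[Dict[str, float], int]:
    database: Dict[str, float] = {}
    model_version = 0
    for r in range(1, rounds + 1):
        ligands = generate(batch, model_version, start=(r - 1) * batch)
        scores = estimate(optimize(validate(assemble(ligands))))
        database.update(scores)          # assessed MOFs go to the database
        if r % retrain_every == 0:
            model_version += 1           # stand-in for periodic retraining
    return database, model_version


database, model_version = run_rounds(rounds=4, batch=4, retrain_every=2)
```

In the agentic version described in this talk, each of these stages becomes its own agent rather than a function call in a single loop.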
MOF discovery accelerated by agentic computation
MOFA online learning + GenAI + simulation code
AI Agent
Knowledge Agent
Computational Agents
Yan et al., “MOFA: Discovering materials for carbon capture with a GenAI- and simulation-based workflow” (Under review; https://arxiv.org/abs/2501.10651)
We agentify the code via Academy
Agentified MOFA code easily maps to many resources
Training
Dataset
Generator
Assembler
Estimator
Database
Validator
Optimizer
Chameleon
Cloud
CPUs
Storage
CPUs
Ligands
MOF
Candidates
Stable
MOFs
Optimized
MOFs
CO2
Capacities
Lattice
Strain
Legend
Agent
Resource
Data Flow
Agents executed remotely via Globus Compute
Data moved via Globus transfer
Authentication and authorization via Globus Auth
Benefits of agentic model:
First batches of ligands
MOF buffer fills and Assembler scales down
Validator scales out to start processing MOFs
Optimizer scales out after first validated MOFs
Estimate CO2 of optimized MOFs
Assembler and Estimator auto-scale
Batch job walltime expires
Agentified MOFA application execution trace
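The scaling behavior in this trace (the Assembler scales down as the MOF buffer fills, the Validator scales out once there are MOFs to process) can be expressed as a simple rule. The sketch below is an assumption about what such a policy might look like, not the policy used in the agentified MOFA application; scale_decision and its parameters are hypothetical.

```python
from typing import Dict


def scale_decision(buffer_len: int, buffer_capacity: int,
                   max_workers: int) -> Dict[str, int]:
    """Split workers between a producer and a consumer by buffer fill level."""
    filled = min(max(buffer_len, 0), buffer_capacity)
    # Producer (Assembler): fewer workers as the buffer fills (floor division).
    assembler = max_workers * (buffer_capacity - filled) // buffer_capacity
    # Consumer (Validator): more workers as the buffer fills (ceiling division).
    validator = -(-max_workers * filled // buffer_capacity)
    return {"assembler": assembler, "validator": validator}


empty = scale_decision(buffer_len=0, buffer_capacity=10, max_workers=4)
half = scale_decision(buffer_len=5, buffer_capacity=10, max_workers=4)
full = scale_decision(buffer_len=10, buffer_capacity=10, max_workers=4)
```

With an empty buffer all workers go to the Assembler; with a full buffer all go to the Validator, matching the first two transitions listed in the trace.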
Hypothesize
Publish
Experiment
Study
Set Goals
Simulate
Humans set research goals
Agents research related work
Agents create hypotheses to test
Develop
Agents write code and protocols
Agents run codes, process results
Agents synthesize, test MOFs in lab
Humans publish results
MOF Discovery Cycle
Human-Driven
Automated
Further automation via additional agents
Lab agent
Query PubMed for ChatGPT feedstock
AI agent
Priyanka Setty
Arvind Ramanathan
Rory Butler
Ongoing R&D in support of agentic IRI and applications
Streaming support in Globus Transfer service
Node
Node
A
B
Transfer from file system A to file system B
Flavio Castro
Raj Kettimuthu
Talks Tuesday, Wednesday
Streaming support in Globus Transfer service
Node
Node
Process C
Process D
Transfer from process C to process D
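The difference from a file-to-file transfer is that data flows from a producer to a consumer as it is generated, without being staged on a file system first. The toy below illustrates that pattern with two threads and a bounded in-memory channel; it is not the Globus Transfer streaming implementation, and producer, consumer, and channel are hypothetical names standing in for process C, process D, and the network path.

```python
import queue
import threading
from typing import List, Optional


def producer(chunks: List[bytes], channel: "queue.Queue[Optional[bytes]]") -> None:
    """Stand-in for process C: emit chunks as they become available."""
    for chunk in chunks:
        channel.put(chunk)   # blocks when the channel is full (back-pressure)
    channel.put(None)        # end-of-stream marker


def consumer(channel: "queue.Queue[Optional[bytes]]", received: List[bytes]) -> None:
    """Stand-in for process D: consume chunks until end of stream."""
    while True:
        chunk = channel.get()
        if chunk is None:
            break
        received.append(chunk)


channel: "queue.Queue[Optional[bytes]]" = queue.Queue(maxsize=2)
received: List[bytes] = []
thread_c = threading.Thread(target=producer, args=([b"a", b"b", b"c"], channel))
thread_d = threading.Thread(target=consumer, args=(channel, received))
thread_c.start()
thread_d.start()
thread_c.join()
thread_d.join()
```

The bounded queue means the producer slows down when the consumer falls behind, which is one of the concerns a streaming service has to handle that a file transfer does not.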
Thank you!
Rachana Ananthakrishnan, Ben Blaiszik, Flavio Castro, Kyle Chard, Ryan Chard, Nathaniel Hudson, Eliu Huerta, Raj Kettimuthu, Greg Pauloski, Arvind Ramanathan, and many others
Collaborators at other DOE labs: LBNL, ORNL, SLAC, etc.
Funding:
Summary: Applications are becoming agentic
🡺 Requiring an “agentic science cloud”
Comments, questions: foster@anl.gov
Robot Sisyphus by Amy Kurzweil
Robotic physical labs
Robotic virtual labs
Many trillions of tokens of structured and unstructured scientific data
1000s of robots generate data, test hypotheses
Exascale systems
train models, generate data, test hypotheses
Embodied learning agents with deep expertise in science principles and practice
Scientific data
Universal data, compute, trust fabric
Scientific agents