1 of 21

xLAM: A Family of Large Action Models to Empower AI Agent Systems

NAACL 2025 Oral Presentation

April 30th, 2025

Presenter: Ming Zhu, Zuxin Liu

Salesforce AI Research

2 of 21

LLM Agent - Why?

There is great potential for LLM agents…

  • Code Agent
  • Web Agent
  • OS Agent

Niebles, J. C. (2024). Language-based AI Agents and Large Action Models (LAMs).

3 of 21

LLM Agent - Powered by Large Action Models

Large Action Models (LAMs): LLMs trained for actions

Zhang et al. AgentOhana: Designing Unified Data and Training Pipeline for Effective Agent Learning.

Niebles, J. C. (2024). Language-based AI Agents and Large Action Models (LAMs).

4 of 21

What is a Large Action Model (LAM)?

Action Data: Open source, domain-specific

Foundation LLMs

LLM trained to take action: function calling, reasoning, planning

5 of 21

LAMs for Tool-Use/Function-Calling Agents

Tools (actions) are the interfaces between agent and the environment

Weng, Lilian. (Jun 2023). LLM-powered Autonomous Agents. Lil’Log.

We focus on the tool-usage / function-calling ability in this work.
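A tool is typically exposed to the model as a structured specification. As a minimal sketch, here is an OpenAI-style function spec in the JSON-Schema convention; the `get_weather` name and fields are illustrative assumptions, not a spec from this work.

```python
# Illustrative tool (function) specification; the name and schema
# are assumptions for demonstration, following a common convention.
get_weather_spec = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
```

The agent sees a list of such specs and must decide which (if any) to call with which arguments.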

6 of 21

Function-Calling Agent Workflow

The agent needs to:

  • Understand the user query
  • Select proper functions to call
  • Generate suitable function arguments
  • Execute the functions and obtain the results
  • Summarize the results and present them to the user
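The steps above can be sketched as a minimal loop. Everything here is a toy stand-in: the tool registry, the stub model, and the trivial summarization are illustrative assumptions, not the xLAM runtime.

```python
import json

# Hypothetical tool registry; names and behavior are illustrative only.
TOOLS = {
    "get_weather": lambda city: {"city": city, "forecast": "sunny"},
}

def run_agent(user_query, model_call):
    """One pass of the workflow: the model selects a function and
    arguments, the runtime executes it, and the result is summarized."""
    # Steps 1-3: understand the query, select a function, generate arguments.
    decision = model_call(user_query)  # e.g. {"name": ..., "arguments": ...}
    # Step 4: execute the function and obtain the results.
    result = TOOLS[decision["name"]](**decision["arguments"])
    # Step 5: summarize the results for the user (here: trivial formatting).
    return f"{decision['name']} returned {json.dumps(result)}"

# Stub model that always calls get_weather; a real LAM generates this choice.
stub = lambda q: {"name": "get_weather", "arguments": {"city": "Paris"}}
print(run_agent("What's the weather in Paris?", stub))
```

In a real deployment, `model_call` is the LAM producing a structured tool call, and the summarization step is another model turn over the tool output.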

7 of 21

Function-Calling Agent Challenges

  1. How to understand the user’s intention and accurately select proper tools?
  • Simple scenario: the user’s query contains only one intention
    • return one tool call from a list of APIs
  • Parallel scenario: the user’s query contains multiple intentions
    • return one or more tool calls from a list of APIs
  • Relevance detection: the user’s query cannot be addressed by the provided APIs
    • should not return any tool calls
  2. How to optimize workflow efficiency and flexibility?
  • Large closed-source models perform well, but…
    • they have high latency and cost
    • prompting is the only way to adapt them to new applications/APIs
  • We need better open-source and customizable models
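The three scenarios can be made concrete with a toy selector that keeps only calls matching a provided API. The API list and call format below are illustrative assumptions, not xLAM's actual output schema.

```python
def select_calls(proposed_calls, available_apis):
    """Keep only proposed calls whose name matches a provided API;
    an empty result models the relevance-detection case."""
    names = {api["name"] for api in available_apis}
    return [c for c in proposed_calls if c["name"] in names]

# Hypothetical API list for illustration.
apis = [{"name": "get_weather"}, {"name": "get_time"}]

# Simple scenario: one intention -> one tool call.
print(select_calls([{"name": "get_weather", "arguments": {"city": "Tokyo"}}], apis))

# Parallel scenario: multiple intentions -> multiple tool calls.
print(select_calls(
    [{"name": "get_weather", "arguments": {"city": "Tokyo"}},
     {"name": "get_time", "arguments": {"city": "Tokyo"}}], apis))

# Relevance detection: no provided API fits -> no tool calls.
print(select_calls([{"name": "book_flight", "arguments": {}}], apis))
```

The hard part, of course, is not the filtering but having the model propose the right calls in the first place; that is what the xLAM training targets.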

8 of 21

xLAM - A Family of Open Large Action Models

9 of 21

xLAM Training Pipeline

10 of 21

xLAM Data Pipeline

11 of 21

xLAM Performance on the Berkeley Function-Calling Leaderboard v2

12 of 21

Dataset Quality Matters

The results highlight the effectiveness of data augmentation and cleaning processes in the data pipeline.

13 of 21

Mobile xLAM’s Potential

14 of 21

Models and Datasets

Detailed instructions on how to use the models are provided.

Quantized versions (GGUF) also available.

15 of 21

xLAM-2 series are also available

  • They secure the top-2 spots on the BFCL-v3 leaderboard
  • A series of model sizes is available: 1B, 3B, 8B, 32B, 70B

16 of 21

xLAM-2 series are also available

State-of-the-art performance on multi-turn agent benchmarks, including BFCL-v3 and tau-bench

17 of 21

APIGen-MT Pipeline

Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay

18 of 21

ActionStudio: A Lightweight Framework for Data and Training of Large Action Models

19 of 21

Future Direction

  • Reduce the latency of inference
    • More concise prompts
    • Embed the knowledge of these APIs inside the model
  • Collect more high-quality data for diverse usage scenarios
    • Expand the API libraries
    • Extend APIGen to collect multi-turn interaction data
  • Extend the data synthesis pipeline beyond simple function-calling
    • Web-agent, OS-agent, etc…

20 of 21

More Work from Our Team

  • APIGen-MT: Agentic PIpeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
  • ActionStudio: A Lightweight Framework for Data and Training of Large Action Models
  • APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Datasets

21 of 21

Thanks