Domain Model Learning for AI Planning
Tutorial at AAAI 2026
What’s the Plan for Today?
2
Time | Session | Speaker |
08:30–09:15 | Introduction & Domain Learning Basics | Roni Stern |
09:15–09:45 | Learning State Abstractions | Roni Stern |
09:45–10:30 | Offline Learning Domain Models | Leonardo Lamanna |
10:30–11:00 | Coffee Break | |
11:00–11:45 | Hands-on Session | Leonardo Lamanna |
11:45–12:30 | Online Learning and Open Challenges | Roni Stern |
1. What is planning?
Symbolic
Domain-independent
2. Why learn a domain for planning?
3. The domain model learning problem
Domain Model Learning in AI Planning
Part 1: Introduction & Basics
Roni Stern
Ben Gurion University of the Negev
Background: Planning
What is Planning?
5
[Diagram: search from the Initial State S, applying Operators to reach states A, B, C, D, E, asking “Is goal?” at each state until the answer is “Yes!” and a plan to the Goal is returned]
Background: Planning
Why Planning?
6
[Diagram: the same search illustration as on the previous slide]
Why Plan if you can Chat?
7
Why Planning?
8
Background: Planning
Why is Planning Hard?
12
[Diagram: the same search illustration as before]
Key challenge in planning: Combinatorial search
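The combinatorial blow-up is easy to quantify: blind search with branching factor b to solution depth d touches on the order of b^d nodes. A small illustrative Python sketch (not from the tutorial materials):

```python
# Illustrative only: node counts for a uniform search tree with branching
# factor b and depth d, showing why uninformed search quickly becomes
# infeasible without heuristics or domain structure.

def nodes_expanded(b: int, d: int) -> int:
    """Total nodes in a uniform search tree of branching factor b, depth d."""
    return sum(b ** i for i in range(d + 1))

for depth in (5, 10, 20):
    print(depth, nodes_expanded(4, depth))
```

Even a modest branching factor of 4 yields over a trillion nodes at depth 20, which is why planners rely on heuristics derived from the domain model.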
Background: Domain-Independent Planning
To build a general planning algorithm, we need a way to explain our problem to the AI.
Shakey the robot
Planning Domain-Definition Language (PDDL)
14
domain.pddl
problem.pddl
Model of the environment
Model of the current task
One domain to rule them all
PDDL Example: Tower of Hanoi
domain.pddl
(define (domain HANOI)
(:requirements :strips)
(:predicates (on ?x ?y)
(smaller ?x ?y)
(clear ?x))
(:action move
:parameters (?x ?y ?z)
:precondition (and
(on ?x ?y)
(clear ?z) (clear ?x)
(smaller ?x ?z))
:effect
(and (on ?x ?z)
(clear ?y)
(not (on ?x ?y))
(not (clear ?z)))))
problem.pddl
(define (problem HANOI-4-0)
(:domain HANOI)
(:objects A B C D E1 E2 E3)
(:init
(ON A B) (ON B C) (ON C D) (ON D E1)
(CLEAR A) (CLEAR E2) (CLEAR E3)
(SMALLER A B) (SMALLER A C) (SMALLER A D)
(SMALLER B C) (SMALLER B D) (SMALLER C D)
(SMALLER A E1) (SMALLER A E2) (SMALLER A E3)
(SMALLER B E1) (SMALLER B E2) (SMALLER B E3)
(SMALLER C E1) (SMALLER C E2) (SMALLER C E3)
(SMALLER D E1) (SMALLER D E2) (SMALLER D E3))
(:goal (AND (ON A B) (ON B C) (ON C D)
(ON D E3))))
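To make the STRIPS semantics of the domain above concrete: a state is a set of ground atoms, a ground action is applicable when its preconditions are a subset of the state, and applying it deletes the negative effects and adds the positive ones. A minimal, illustrative Python sketch (not a full PDDL interpreter; the two-disc object names are hypothetical):

```python
# Ground-level simulator for the Hanoi `move` action under STRIPS semantics.
# States are sets of ground atoms represented as tuples.

def applicable(state, x, y, z):
    """Preconditions of move(?x ?y ?z): on(x,y), clear(z), clear(x), smaller(x,z)."""
    return {("on", x, y), ("clear", z), ("clear", x), ("smaller", x, z)} <= state

def apply_move(state, x, y, z):
    """Delete effects: on(x,y), clear(z); add effects: on(x,z), clear(y)."""
    assert applicable(state, x, y, z)
    return (state - {("on", x, y), ("clear", z)}) | {("on", x, z), ("clear", y)}

# Two-disc example: discs A (small) and B (large) on peg P1; move A to peg P3.
state = {("on", "A", "B"), ("on", "B", "P1"), ("clear", "A"),
         ("clear", "P2"), ("clear", "P3"),
         ("smaller", "A", "B"), ("smaller", "A", "P2"), ("smaller", "A", "P3"),
         ("smaller", "B", "P2"), ("smaller", "B", "P3")}
next_state = apply_move(state, "A", "B", "P3")
print(("on", "A", "P3") in next_state, ("clear", "B") in next_state)
```

The subset test for preconditions and the delete-then-add update are exactly what a planner's successor generator computes for each ground action.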
Classical Planning and Beyond
Classical planning (PDDL)
Beyond classical planning
17
The Dream: One Planner to Rule them All
18
Planning Problem → Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …) → AI Planner → Plan
How to find a plan in a large state space?
Which formal language to use?
Planning Problem → ?
Learning Planning Domain Models
[Diagram: many Planning Problems yield Observations of operator executions; an AI Learner converts them into a Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …), which an AI Planner turns into a Plan]
Learning Planning Domain Models
[The same diagram, with the learner labeled “Planning Domain Learner”]
Why Plan if you can just Act?
21
Planning vs. Reinforcement Learning (RL)
Symbolic Planning | Reinforcement Learning |
Achieve goal, minimize cost | Maximize reward |
Declaratively described state space, requires strong assumptions | Unstructured state space, few assumptions needed |
Explicitly designed to exploit domain structure (predicates, signatures, etc.) | Implicit use of domain structure (learned online) |
Offline planning, no errors ;) | Online, requires trial-and-error |
No training; long, problem-specific planning | Long, problem-specific training |
Formal guarantees on solution quality | Hard to obtain guarantees on solution |
Offline RL
Meta RL
RL+shielding
DNN Verifiers
Why Planning?
23
Why Learn a Planning Domain
instead of just Learning to Act?
27
Learning Planning Domain vs. RL
Learning Planning Domains | Reinforcement Learning (Model-Based) | Reinforcement Learning (Model-Free) |
? | Fewer samples than model-free | Usually data-intensive |
? | Runtime scales well with data | |
Produces a reusable, composable domain model that planners can exploit | Produces a policy optimizing a fixed reward (can retrain offline) | Produces a policy optimizing a fixed reward |
Lifelong RL
Transfer learning
Offline RL
Meta RL
RL+shielding
DNN Verifiers
Learning symbolic planning domains can be viewed as a special case of model-based RL (MBRL)
Why Learn a Symbolic Planning Domain?
…
29
See more on this: Vallati et al. ’25
Learning Planning Domain Models
[The same learning diagram: Observations from multiple Planning Problems feed an AI Learner that outputs a Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …) for an AI Planner]
FAMA (Aineto et al.),
ARMS (Yang et al.), LOCM (Cresswell et al.),
NOLAM (Lamanna & Serafini)
….
“Deep RL (e.g., PPO, DQN, …) can do it better!”
“Learning the domain is too hard”
Learning Planning Domain Models?
31
Is it really?
How far can we go?
What’s the Plan for Today?
32
1. What is planning?
Symbolic
Domain-independent
2. Why learn a domain for planning?
3. The domain model learning problem
Offline vs. Online Domain Model Learning
33
[Diagram: in Offline Learning, an External Agent executes actions and the Learning Agent only receives the resulting action/state observations; in Online Learning, the Learning Agent itself selects actions and observes the resulting states; in Active Learning, the Learning Agent additionally chooses its actions so as to gather informative observations]
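The offline/online distinction can be phrased as an interface difference: an offline learner consumes pre-recorded traces, while an online learner chooses actions and updates its model from the outcomes. A hypothetical sketch (names and signatures are illustrative, not from any library):

```python
from abc import ABC, abstractmethod

class OfflineLearner(ABC):
    """Learns a domain model from traces produced by an external agent."""

    @abstractmethod
    def learn(self, traces):
        """Consume pre-recorded (state, action, next_state) traces and
        return a domain model."""

class OnlineLearner(ABC):
    """Learns a domain model while acting in the environment."""

    @abstractmethod
    def choose_action(self, state):
        """Pick the next action to execute; an active learner chooses the
        action expected to be most informative for the model."""

    @abstractmethod
    def observe(self, state, action, next_state):
        """Update the current domain model after executing an action."""
```

Concrete algorithms covered later in the tutorial can be seen as instantiations of one of these two interfaces.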
Types of Domain Model Learning Problems
34
What Can We Observe?
35
[Diagram: an External Agent executes a sequence of actions, e.g. Move(A, B), Load(Pkg, C), Move(B, C), while the Learning Agent observes the full state before and after each action]
AKA: Trace, trajectory, episode, execution, transitions ….
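Whatever the name, a fully observable trace can be represented as a sequence of (state, action, next_state) transitions. One possible Python encoding (the logistics-style atoms and action names are illustrative only):

```python
from dataclasses import dataclass

# States are frozensets of ground atoms; actions are name-plus-argument tuples.

@dataclass(frozen=True)
class Transition:
    state: frozenset       # atoms true before the action
    action: tuple          # action name plus arguments
    next_state: frozenset  # atoms true after the action

s0 = frozenset({("at", "truck", "A"), ("at", "pkg", "C")})
s1 = frozenset({("at", "truck", "B"), ("at", "pkg", "C")})
s2 = frozenset({("at", "truck", "C"), ("at", "pkg", "C")})

trace = [
    Transition(s0, ("Move", "A", "B"), s1),
    Transition(s1, ("Move", "B", "C"), s2),
]
print(len(trace), trace[0].action)
```

Partial, noisy, or raw observability (next slides) can then be modeled as transformations applied to the `state` and `next_state` fields before the learner sees them.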
Partial Observability
36
[Diagram: the same kind of trace, but some fluents of each observed state are hidden from the learner]
Noisy Observations
39
[Diagram: the same kind of trace, but some observed fluents are incorrect due to sensing noise]
Raw Observations
40
[Diagram: the same kind of trace, but states arrive as raw perception rather than ground fluents]
Images?
Text?
Sensor data?
Types of Domain Model Learning Problems
41
domain.pddl
(define (domain HANOI)
(:requirements :strips)
(:predicates (on ?x ?y) (smaller ?x ?y) (clear ?x))
(:action move
:parameters (?x ?y ?z)
:precondition
(and (on ?x ?y) (clear ?z)
(clear ?x) (smaller ?x ?z))
:effect
(and (on ?x ?z) (clear ?y)
(not (on ?x ?y)) (not (clear ?z)))))
Representation Learning
Action Model Learning
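The action model learning side can be illustrated with a toy offline learner over ground transitions: add effects are atoms that became true, delete effects are atoms that became false, and preconditions are approximated by intersecting all observed pre-states. This is a didactic sketch of the common core idea, not an implementation of any specific algorithm covered in this tutorial:

```python
# Toy offline action-model learner over ground (pre, action, post) transitions.
# States are sets of ground atoms represented as tuples.

def learn_model(transitions):
    model = {}  # action -> {"pre": set, "add": set, "del": set}
    for pre, action, post in transitions:
        m = model.setdefault(action, {"pre": set(pre), "add": set(), "del": set()})
        m["pre"] &= pre          # a precondition must hold in every pre-state
        m["add"] |= post - pre   # atoms that became true
        m["del"] |= pre - post   # atoms that became false
    return model

# Hypothetical example: ("fuel",) holds in only one pre-state, so the
# intersection correctly drops it from the learned preconditions.
t1 = ({("at", "A"), ("fuel",)}, "move-A-B", {("at", "B"), ("fuel",)})
t2 = ({("at", "A")}, "move-A-B", {("at", "B")})
model = learn_model([t1, t2])
print(model["move-A-B"]["pre"])
```

Real learners such as those presented in the offline session must additionally lift ground atoms to parameterized schemas and cope with partial and noisy observations.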
What’s Next
42
Time | Session | Speaker |
08:30–09:15 | Introduction & Domain Learning Basics | Roni Stern |
09:15–09:45 | Learning State Abstractions | Roni Stern |
09:45–10:30 | Offline Learning of Action Models | Leonardo Lamanna |
10:30–11:00 | Coffee Break | |
11:00–11:45 | Hands-on Session | Leonardo Lamanna |
11:45–12:30 | Online Learning and Open Challenges | Roni Stern |