What’s the Plan for Today?
Time | Session | Speaker |
08:30–09:15 | Introduction & Domain Learning Basics | Roni Stern |
09:15–10:00 | Offline Learning Action Models | Leonardo Lamanna |
10:00–10:30 | Coffee Break ☕ | |
10:30–11:00 | Learning State Abstraction | Roni Stern |
11:00–11:45 | Active Learning and Open Challenges | Roni Stern |
11:45–12:30 | Hands-on Session | Leonardo Lamanna |
What’s the Plan for Today?
Time | Session | Speaker |
08:30–09:15 | Introduction & Domain Learning Basics | Roni Stern |
09:15–10:00 | Offline Learning Action Models | Argaman Mordoch |
10:00–10:30 | Coffee Break ☕ | |
10:30–11:00 | Learning State Abstraction | Roni Stern |
11:00–11:45 | Active Learning and Open Challenges | Argaman Mordoch & Roni Stern |
11:45–12:30 | Hands-on Session | Leonardo Lamanna |
1. What is planning?
Symbolic
Domain-independent
2. Why learn a domain for planning?
3. The domain model learning problem
Domain Model Learning in AI Planning
Part 1: Introduction & Basics
Roni Stern
Ben-Gurion University of the Negev
Background: Planning
What is Planning?
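[Figure: a search tree rooted at the initial state S. Operators generate the successor states A, B, C, D, E; each state is checked with “Is goal?” until one answers “Yes!”. The operator sequence that reaches the goal is a plan.]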
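The search picture above can be sketched in a few lines of code. This is a minimal illustration only: the successor graph, the operator names (`op1`…`op5`), and the choice of E as the goal are invented for the example, and breadth-first search stands in for whatever search strategy a real planner would use.

```python
from collections import deque

# Hypothetical state space: edges and operator names are made up for illustration.
SUCCESSORS = {
    "S": [("op1", "A"), ("op2", "B")],
    "A": [("op3", "C"), ("op4", "D")],
    "B": [("op5", "E")],
    "C": [], "D": [], "E": [],
}

def bfs_plan(initial, is_goal, successors):
    """Return a shortest list of operator names reaching a goal state, or None."""
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, plan = frontier.popleft()
        if is_goal(state):              # the "Is goal?" test from the slide
            return plan
        for op, nxt in successors.get(state, []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, plan + [op]))
    return None                         # explicit "no solution"

plan = bfs_plan("S", lambda s: s == "E", SUCCESSORS)
no_plan = bfs_plan("S", lambda s: s == "Z", SUCCESSORS)
print(plan, no_plan)
```

Note that the search returns `None` when no state satisfies the goal test, rather than inventing an answer.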
Why Planning?
Why Plan if you can Chat?
4 months have passed…
(since the AAAI tutorial)
Symbolic Planning
The World Still Needs Symbolic Planners
Reasoning trace can be explained
Designed for long-term reasoning
Can output “no solution” (instead of hallucinating)
Formal validation of each step
Background: Domain-Independent Planning
To build a general planning algorithm, we need a way to explain our problem to the AI
Shakey the robot
Planning Domain Definition Language (PDDL)
domain.pddl: model of the environment
problem.pddl: model of the current task
One domain to rule them all
PDDL Example: Tower of Hanoi
domain.pddl
(define (domain HANOI)
  (:requirements :strips)
  (:predicates (on ?x ?y)
               (smaller ?x ?y)
               (clear ?x))
  (:action move
    :parameters (?x ?y ?z)
    :precondition (and
      (on ?x ?y)
      (clear ?z) (clear ?x)
      (smaller ?x ?z))
    :effect
      (and (on ?x ?z)
           (clear ?y)
           (not (on ?x ?y))
           (not (clear ?z)))))
problem.pddl
(define (problem HANOI-4-0)
  (:domain HANOI)
  (:objects A B C D E1 E2 E3)
  (:init
    (ON A B) (ON B C) (ON C D) (ON D E1)
    (CLEAR A) (CLEAR E2) (CLEAR E3)
    (SMALLER A B) (SMALLER A C) (SMALLER A D)
    (SMALLER B C) (SMALLER B D) (SMALLER C D)
    (SMALLER A E1) (SMALLER A E2) (SMALLER A E3)
    (SMALLER B E1) (SMALLER B E2) (SMALLER B E3)
    (SMALLER C E1) (SMALLER C E2) (SMALLER C E3)
    (SMALLER D E1) (SMALLER D E2) (SMALLER D E3))
  (:goal (AND (ON A B) (ON B C) (ON C D)
              (ON D E3))))
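To make the semantics of the `move` action concrete, here is a rough Python sketch of how a grounded STRIPS action is checked and applied to the initial state of HANOI-4-0. The representation (atoms as tuples, states as frozensets) and the helper names `initial_state` and `move` are assumptions made for this illustration; they are not part of PDDL or of any particular planner.

```python
# Ground atoms are tuples such as ("on", "A", "B"); a state is a frozenset of atoms.
DISCS = ["A", "B", "C", "D"]          # A is the smallest disc
PEGS = ["E1", "E2", "E3"]

def initial_state():
    """Build the :init section of HANOI-4-0 as a set of ground atoms."""
    atoms = {("on", "A", "B"), ("on", "B", "C"), ("on", "C", "D"), ("on", "D", "E1"),
             ("clear", "A"), ("clear", "E2"), ("clear", "E3")}
    # (smaller x y): each disc is smaller than every larger disc and than every peg.
    for i, d in enumerate(DISCS):
        atoms |= {("smaller", d, bigger) for bigger in DISCS[i + 1:] + PEGS}
    return frozenset(atoms)

def move(state, x, y, z):
    """Apply the grounded move(?x ?y ?z); return None if a precondition fails."""
    pre = {("on", x, y), ("clear", z), ("clear", x), ("smaller", x, z)}
    if not pre <= state:
        return None
    add = {("on", x, z), ("clear", y)}
    delete = {("on", x, y), ("clear", z)}
    return frozenset((state - delete) | add)

s0 = initial_state()
s1 = move(s0, "A", "B", "E2")         # move the smallest disc onto peg E2
blocked = move(s0, "B", "C", "E3")    # B is not clear in s0, so this is inapplicable
print(("on", "A", "E2") in s1, blocked)
```

The delete effects are removed before the add effects are inserted, which is the usual STRIPS convention.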
Classical Planning and Beyond
Classical planning (PDDL)
Beyond classical planning
The Dream: One Planner to Rule them All
Planning Problem → ? → Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS,…) → AI Planner → Plan
Why is Planning Hard?
Which formal language to use?
How to find a plan in a large state space?
Learning Planning Domain Models
[Diagram: Planning Problem → ? → Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS,…) → AI Planner → Plan. An AI Learner (a Planning Domain Learner) takes observations of an operator as input; in the later builds of the slide the “?” no longer appears.]
Why Plan if you can just Act?
Planning vs. Reinforcement Learning (RL)
Symbolic Planning | Reinforcement Learning |
Achieve goal, minimize cost | Maximize reward |
Declaratively described state space, requires strong assumptions | Unstructured state space, few assumptions needed |
Explicitly designed to exploit domain structure (predicates, signatures, etc.) | Implicit use of domain structure (learned online) |
Offline planning, no errors ;) | Online, requires trial-and-error |
No training; long, problem-specific planning | Long, problem-specific training |
Formal guarantees on solution quality | Hard to obtain guarantees on solution |
Offline RL
Meta RL
RL+shielding
DNN Verifiers
Why Learn a Planning Domain instead of just Learning to Act?
Learning Planning Domain vs. RL
Learning Planning Domains | Model-Based RL | Model-Free RL |
? | Fewer samples than model-free | Usually data-intensive |
? | Runtime scales well with data | |
Produces reusable, composable domain model that planners can exploit | Produces policy optimizing a fixed reward (can retrain offline) | Produces policy optimizing a fixed reward |
Lifelong RL
Transfer learning
Offline RL
Meta RL
RL+shielding
DNN Verifiers
Learning symbolic planning domains can be viewed as a special case of model-based RL (MBRL)
Why Learn a Symbolic Planning Domain?
…
See more on this: Vallati et al. ’25
Learning Planning Domain Models
AI Learner, e.g.: FAMA (Aineto et al.), ARMS (Yang et al.), LOCM (Cresswell et al.), NOLAM (Lamanna & Serafini), …
Deep RL (e.g., PPO, DQN,…) can do it better!
Learning the domain is too hard
Learning Planning Domain Models?
Is it really?
How far can we go?
What’s the Plan for Today?
1. What is planning?
Symbolic
Domain-independent
2. Why learn a domain for planning?
3. The domain model learning problem
Types of Domain Model Learning Problems
domain.pddl
(define (domain HANOI)
  (:requirements :strips)
  (:predicates (on ?x ?y) (smaller ?x ?y) (clear ?x))
  (:action move
    :parameters (?x ?y ?z)
    :precondition
      (and (on ?x ?y) (clear ?z)
           (clear ?x) (smaller ?x ?z))
    :effect
      (and (on ?x ?z) (clear ?y)
           (not (on ?x ?y)) (not (clear ?z)))))
Representation Learning
Action Model Learning
Offline vs. Online Domain Model Learning
Offline Learning: an external agent acts, and the learning agent receives observations of its actions and states.
Online Learning (Active Learning): the learning agent chooses the actions itself and observes the resulting states.
What Can We Observe?
[Diagram: offline learning. An external agent produces observations for the learning agent: a sequence of states (over A, B, C) interleaved with actions Move(A, B), Load(Pkg, C), Move(B, C).]
AKA: trace, trajectory, episode, execution, transitions, …
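As a rough illustration of what a learner can do with fully observed trajectories, the sketch below lifts each observed transition to the action's parameters, intersects the pre-states to get candidate preconditions, and collects state differences as effects. It is a deliberately simple intersect-and-diff learner, not an implementation of FAMA, ARMS, LOCM, or NOLAM. It assumes full, noise-free observations, deterministic actions, distinct action arguments, and effects that mention only the action's arguments. The `move(truck, from, to)` signature and the predicates `at`, `road`, and `fueled` are hypothetical.

```python
def lift(atom, args):
    """Replace objects by parameter names; None if the atom mentions a non-argument."""
    name, *objs = atom
    if not all(o in args for o in objs):
        return None
    return (name,) + tuple(f"?p{args.index(o)}" for o in objs)

def lifted(atoms, args):
    """Lift every atom that can be expressed over the action's parameters."""
    return {l for l in (lift(a, args) for a in atoms) if l is not None}

def learn(transitions):
    """transitions: iterable of (pre_state, (action_name, args), post_state)."""
    model = {}
    for pre, (name, args), post in transitions:
        pre_l = lifted(pre, args)
        add_l = lifted(post - pre, args)   # atoms that became true
        del_l = lifted(pre - post, args)   # atoms that became false
        if name not in model:
            model[name] = {"pre": pre_l, "add": add_l, "del": del_l}
        else:
            m = model[name]
            m["pre"] &= pre_l              # keep atoms true before every observed use
            m["add"] |= add_l
            m["del"] |= del_l
    return model

# Two hypothetical transitions taken from different trajectories.
s0 = frozenset({("at", "truck", "A"), ("road", "A", "B"), ("road", "B", "C"),
                ("fueled", "truck")})
s1 = frozenset({("at", "truck", "B"), ("road", "A", "B"), ("road", "B", "C"),
                ("fueled", "truck")})
t0 = frozenset({("at", "truck", "B"), ("road", "B", "C")})
t1 = frozenset({("at", "truck", "C"), ("road", "B", "C")})

model = learn([(s0, ("move", ("truck", "A", "B")), s1),
               (t0, ("move", ("truck", "B", "C")), t1)])
print(model["move"])
```

In this toy data, `fueled` holds before the first `move` but not before the second, so the intersection drops it from the learned preconditions.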
Partial Observability
[Diagram: the offline trajectory of states (over A, B, C) and actions Move(A, B), Load(Pkg, C), Move(B, C), passed from the external agent to the learning agent.]
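One possible way to represent partially observed states, sketched here under stated assumptions: an observation is a dictionary from atoms to truth values, and any atom missing from the dictionary is unknown. The helpers `mask` and `certain_effects` and the `at` predicate are hypothetical names for this illustration. The point is that an effect can be read off a transition only when the atom is observed both before and after the action.

```python
def mask(state, all_atoms, hidden):
    """Turn a full state (set of true atoms) into a partial observation.

    Returns a dict atom -> bool; atoms in `hidden` are left out, meaning "unknown".
    """
    return {a: (a in state) for a in all_atoms if a not in hidden}

def certain_effects(pre_obs, post_obs):
    """Atoms observed on both sides of a transition whose value changed."""
    both = pre_obs.keys() & post_obs.keys()
    added = {a for a in both if not pre_obs[a] and post_obs[a]}
    deleted = {a for a in both if pre_obs[a] and not post_obs[a]}
    return added, deleted

ATOMS = {("at", "truck", "A"), ("at", "truck", "B"), ("at", "truck", "C")}
pre_state = {("at", "truck", "A")}
post_state = {("at", "truck", "B")}

pre_obs = mask(pre_state, ATOMS, hidden=set())
post_obs = mask(post_state, ATOMS, hidden={("at", "truck", "A")})  # not seen afterwards

added, deleted = certain_effects(pre_obs, post_obs)
print(added, deleted)
```

Here the add effect is recovered, but the delete effect on `("at", "truck", "A")` cannot be confirmed because that atom is hidden in the post-state.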
Noisy Observations
[Diagram: the offline trajectory of states (over A, B, C) and actions Move(A, B), Load(Pkg, C), Move(B, C), passed from the external agent to the learning agent.]
Raw Observations
[Diagram: the offline trajectory of states (over A, B, C) and actions Move(A, B), Load(Pkg, C), Move(B, C), passed from the external agent to the learning agent.]
Images?
Text?
Sensor data?
What’s the Plan for Today?
Next: Offline Learning of Action Models
Argaman Mordoch
BGU
Rate This Tutorial @ domain-learning.github.io