Rate This Tutorial @ domain-learning.github.io
Domain Model Learning in AI Planning Tutorial
AAMAS 2026
What’s the Plan for Today?
Next: Learning State Abstraction for Planning
Time | Session | Speaker |
08:30–09:15 | Introduction & Domain Learning Basics | Roni Stern |
09:15–10:00 | Offline Learning Action Models | Argaman Mordoch |
10:00–10:30 | Coffee Break ☕ | |
10:30–11:00 | Learning State Abstraction | Roni Stern |
11:00–11:45 | Active Learning and Open Challenges | Argaman Mordoch & Roni Stern |
11:45–12:30 | Hands-on Session | Leonardo Lamanna |
Part 3: Learning State Abstractions
Roni Stern
Ben-Gurion University of the Negev
Some slides from: Christian Muise, Masataro Asai
Representation Learning
Learning a Symbolic Representation of the World
In general: hard task, ill-defined, deeply studied (ICLR?)
In particular: a symbolic representation for planning
Still hard, still ill-defined ☺
Learning Planning Domain Models
[Diagram: Planning Problems, an Operator, and Observations feed the Planning Domain Learner, which produces a Formal Domain + Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …)]
Learning Planning Domain Models
[Diagram: Planning Problems, an Operator, and Observations feed the Planning Domain Learner, which produces domain.pddl, a model of the environment]
Learning Planning Domain Models
domain.pddl
Model of the environment
(:predicates (on ?x ?y)
(smaller ?x ?y)
(clear ?x))
(:action move
:parameters (?x ?y ?z))
What is the input?
Types of Input for Representation Learning
[Figure: example trajectory over A, B, C with the actions Move(A, B), Load(Pkg, C), and Move(B, C)]
Images
Text
Raw set of features
(e.g., sensor data)
Basic set of features
(e.g., goal & init description)
LOCM
SIFT
LLM-based
VisualPredicator
Skills2Symbols
LatPlan
Program Synthesis
I-ROSAME
Learning From Action Sequences
[Figure: example trajectory over A, B, C with the actions Move(A, B), Load(Pkg, C), and Move(B, C)]
Move(A,B) modifies some predicates over A or B
Move(B,C) modifies some predicates over B or C
Insight: it’s the same change!
Actually, this is an assumption
The “object-centric” approach
Types of Input for Representation Learning
[Figure: example trajectory over A, B, C with the actions Move(A, B), Load(Pkg, C), and Move(B, C)]
Images
Text
Raw set of features
(e.g., sensor data)
Basic set of symbols
(e.g., goal description)
Low-level controllers
Learning From Action Sequences
Approach: “object-centric”
[Figure: example trajectory over A, B, C with the actions Move(A, B), Load(Pkg, C), and Move(B, C)]
Move(A,B) modifies some predicates over A or B
Move(B,C) modifies some predicates over B or C
Insight: it’s the same change!
Actually, this is an assumption
LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]
Assumptions
The action Do(x) changes the state of x to “Done” (Done(x))
If Do(x) changes the state of x to “Done” (Done(x)), then Do(y) changes the state of y to “Done” (Done(y))
[Diagram: a two-state FSA; Do(x) moves x from “Not Done” to “Done”, and Undo(x) moves it back]
Intuition: each FSA state becomes a predicate over x
LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]
Example
Actions on object o: open(o), fetch-jack(_,o), fetch-wrench(_,o), close(o)
FSA states: S1, S2, S3, S4, S5, S6, S7, S8
LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]
Example
Actions on object o: open(o), fetch-jack(_,o), fetch-wrench(_,o), close(o)
FSA states after merging: {S2,S3}, {S4,S5}, {S6,S7}, {S1}, {S8}
LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]
Example
Actions on object o: open(o), fetch-jack(_,o), fetch-wrench(_,o), close(o)
FSA states being merged: S5 joins {S2,S3}, S7 joins {S4,S5}, S3 joins {S6,S7}; S1 and S8 stay separate
LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]
Example
Actions on object o: open(o), fetch-jack(_,o), fetch-wrench(_,o), close(o)
FSA states after merging: {S2,S3,S4,S5,S6,S7}, {S1}, {S8}
LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]
Example
Actions on object o: open(o), fetch-jack(_,o), fetch-wrench(_,o), close(o)
FSA states after merging: {S2,S3,S4,S5,S6,S7}, {S1,S8}
LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]
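Example
[Diagram: the final FSA with two states, {S2,S3,S4,S5,S6,S7} and {S1,S8}, and transitions open(o), close(o), fetch-jack(_,o), fetch-wrench(_,o)]
Each FSA state becomes a predicate: P1(o), P2(o)
Possible meanings: closed(o)? opened(o)?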
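The state merging in the example can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the LOCM implementation: each (action name, argument position) pair gets one start and one end state, `None` stands for the ignored `_` argument, and the names `UnionFind` and `merge_states` are hypothetical.

```python
from collections import defaultdict


class UnionFind:
    """Minimal union-find over hashable items."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def merge_states(traces):
    """Merge FSA states: the end state of an action equals the start state
    of the next action applied to the same object."""
    uf = UnionFind()
    for trace in traces:
        last = {}  # object -> (action name, position) that last touched it
        for name, args in trace:
            for pos, obj in enumerate(args):
                if obj is None:  # '_' argument: ignored in this sketch
                    continue
                cur = (name, pos)
                uf.find((cur, "start"))
                uf.find((cur, "end"))
                if obj in last:
                    uf.union((last[obj], "end"), (cur, "start"))
                last[obj] = cur
    groups = defaultdict(set)
    for state in list(uf.parent):
        groups[uf.find(state)].add(state)
    return list(groups.values())


# Two hypothetical traces: the fetch actions occur in both orders,
# and in the second trace the object is reopened after being closed.
traces = [
    [("open", ["o1"]), ("fetch-jack", [None, "o1"]),
     ("fetch-wrench", [None, "o1"]), ("close", ["o1"])],
    [("open", ["o2"]), ("fetch-wrench", [None, "o2"]),
     ("fetch-jack", [None, "o2"]), ("close", ["o2"]), ("open", ["o2"])],
]
fsa_states = merge_states(traces)
```

On these traces the eight initial states collapse into two groups, matching the {S2,…,S7} and {S1,S8} states of the example.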
Additional details and extensions
SIFT [Gosgens et al. ’25]
Assumption: domain is a well-formed STRIPS domain
If Opened(Door) is an effect of Open(Door),
then not(Opened(Door)) is a precondition of Open(Door)
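A minimal sketch of what the well-formedness assumption buys: preconditions can be read off the effects. The function name `complete_preconditions` and the literal encoding are hypothetical, and the symmetric rule for delete effects is an assumption added here for illustration; the slide states only the add case.

```python
def complete_preconditions(add_effects, del_effects, known_pre=frozenset()):
    """Return the preconditions implied by well-formedness.

    Literals are (atom, polarity) pairs: (p, True) is p, (p, False) is not(p).
    """
    pre = set(known_pre)
    for atom in add_effects:
        pre.add((atom, False))  # adding p requires not(p) beforehand
    for atom in del_effects:
        pre.add((atom, True))   # assumed symmetric case: deleting p requires p
    return pre


# Open(Door) adds Opened(Door), so not(Opened(Door)) becomes a precondition.
open_pre = complete_preconditions(add_effects={"Opened(Door)"}, del_effects=set())
```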
SIFT pseudo-code
While not done
SIFT [Gosgens et al. ’25]
An action pattern represents a predicate that is affected by the action
A feature represents a predicate and all the actions affecting it
SIFT [Gosgens et al. ’25]
Example
P(?t1)
2. Generate constraints based on traces
SIFT [Gosgens et al. ’25]
Example
P(?t1)
2. Generate constraints based on traces
Is satisfiable?
Yes
SIFT [Gosgens et al. ’25]
Example
P(?t1)
2. Generate constraints based on traces
UNSAT!
Note: SAT check here is efficient (2-SAT)
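One way to see why such a check can be cheap is the simplified sketch below; it is not the paper's algorithm. It assumes a unary candidate predicate P, that only the listed action patterns affect P, and that consecutive affecting actions on the same object must alternate between add and delete. Each constraint then says "these two patterns have different signs", a two-literal constraint that 2-colouring decides in linear time. The function name `signs_satisfiable` is hypothetical.

```python
def signs_satisfiable(patterns, traces):
    """Try to label every pattern as add (True) or delete (False).

    patterns: set of (action name, argument position) pairs.
    traces:   list of action sequences; each action is (name, args).
    Returns a sign assignment, or None if the constraints are unsatisfiable.
    """
    # Collect "must differ" constraints from consecutive affecting actions.
    differ = set()
    for trace in traces:
        last = {}  # object -> pattern that last affected P(object)
        for name, args in trace:
            for pos, obj in enumerate(args):
                pat = (name, pos)
                if pat not in patterns:
                    continue
                if obj in last:
                    differ.add((last[obj], pat))
                last[obj] = pat
    neighbours = {p: set() for p in patterns}
    for a, b in differ:
        if a == b:
            return None  # a pattern cannot differ from itself
        neighbours[a].add(b)
        neighbours[b].add(a)
    # 2-colour the constraint graph with a depth-first traversal.
    sign = {}
    for root in patterns:
        if root in sign:
            continue
        sign[root] = True
        stack = [root]
        while stack:
            cur = stack.pop()
            for nxt in neighbours[cur]:
                if nxt not in sign:
                    sign[nxt] = not sign[cur]
                    stack.append(nxt)
                elif sign[nxt] == sign[cur]:
                    return None
    return sign


# open/close alternate on the same door: satisfiable.
ok = signs_satisfiable(
    {("open", 0), ("close", 0)},
    [[("open", ["d"]), ("close", ["d"]), ("open", ["d"])]],
)
# fetch-jack hits the same object twice in a row: unsatisfiable.
bad = signs_satisfiable(
    {("open", 0), ("fetch-jack", 1)},
    [[("open", ["d"]), ("fetch-jack", ["j", "d"]), ("fetch-jack", ["j", "d"])]],
)
```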
SIFT [Gosgens et al. ’25]
Problem: too many possible “features”
Solution(*): infer object types!
SIFT pseudo-code
While not done
SIFT [Gosgens et al. ’25]
Grouping action parameters to infer types
Example
open(?x1)                 ?x1: t1?
fetch-jack(?x2 ?x3)       ?x2: t2?, ?x3: t1?
fetch-wrench(?x4 ?x5)     ?x4: t3?, ?x5: t1?
close(?x6)                ?x6: t1?
SIFT [Gosgens et al. ’25]
Grouping action parameters to infer types
Example
open(?t1)
fetch-jack(?t2 ?t1)
fetch-wrench(?t3 ?t1)
close(?t1)
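Grouping action parameters can be sketched with a union-find: whenever the same object fills two parameter slots, those slots are assumed to share a type. This is an illustrative sketch rather than the paper's procedure; `infer_types` and the object names are hypothetical.

```python
from collections import defaultdict


def infer_types(traces):
    """Group (action name, argument position) slots that share objects."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_slot = {}  # object -> first slot it was seen in
    for trace in traces:
        for name, args in trace:
            for pos, obj in enumerate(args):
                slot = (name, pos)
                root = find(slot)
                if obj in first_slot:
                    # Same object in two slots: merge their groups.
                    parent[find(first_slot[obj])] = root
                else:
                    first_slot[obj] = slot
    groups = defaultdict(set)
    for slot in list(parent):
        groups[find(slot)].add(slot)
    return list(groups.values())


trace = [
    ("open", ["b1"]),
    ("fetch-jack", ["j1", "b1"]),
    ("fetch-wrench", ["w1", "b1"]),
    ("close", ["b1"]),
]
types = infer_types([trace])
```

Here the object `b1` links the single slot of open and close with the second slot of both fetch actions, giving one shared type plus one type each for the jack and the wrench, as in the example.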
SIFT [Gosgens et al. ’25]
Many additional details in the paper
…
SIFT pseudo-code
While not done
SIFT [Gosgens et al. ’25]
Pros:
Discussion:
Learning From Images
What would be a good symbolic state?
What would be a good symbolic action?
LatPlan [Asai and Fukunaga ’18]
What would be a good symbolic state for an image?
Solution: the State AutoEncoder (SAE)
LatPlan [Asai and Fukunaga ’18]
What would be a good symbolic state?
Solution: the State AutoEncoder (SAE)
DNNs
Latent state representation = the symbolic state
Use Gumbel-Softmax so the latent representation is propositional (Booleans)
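For readers unfamiliar with the trick, here is a minimal standard-library sketch of a Gumbel-Softmax sample. It illustrates the general technique, not LatPlan's code; the `noise` and `rng` parameters are hypothetical hooks added to make the example reproducible. A Boolean latent bit corresponds to a two-category variable (true/false).

```python
import math
import random


def gumbel_softmax(logits, tau, noise=None, rng=random):
    """Relaxed sample from a categorical distribution.

    logits: unnormalised log-probabilities, one per category.
    tau:    temperature; smaller values push the output toward one-hot.
    noise:  optional pre-drawn Gumbel(0, 1) noise (hypothetical test hook).
    """
    if noise is None:
        # Gumbel(0, 1) sample: -log(-log(U)) with U ~ Uniform(0, 1);
        # the max() guards against log(0).
        noise = [-math.log(-math.log(max(rng.random(), 1e-12))) for _ in logits]
    scaled = [(l + g) / tau for l, g in zip(logits, noise)]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# With zero noise and a low temperature the output is almost one-hot.
bit = gumbel_softmax([2.0, 0.0], tau=0.1, noise=[0.0, 0.0])
# A random relaxed sample over three categories.
sample = gumbel_softmax([0.5, -0.5, 0.0], tau=1.0, rng=random.Random(0))
```

Because the output is a differentiable function of the logits, a network can be trained end to end and the near-one-hot vectors rounded to discrete values afterwards.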
LatPlan [Asai and Fukunaga ’18]
What would be a good symbolic action?
Solution: the Action AutoEncoder (AAE)
DNNs
Latent action representation = the symbolic action
Use Gumbel-Softmax so the latent representation is categorical (Booleans)
LatPlan [Asai and Fukunaga ’18]
What would be a good symbolic representation?
[Diagram: the LatPlan architecture, composed of several DNNs]
LatPlan [Asai and Fukunaga ’18]
What would be a good symbolic representation?
Cube-Space AutoEncoder [Asai and Muise ’20]
What would be a good symbolic representation?
Encodes preconditions and effects “directly”
Many details in the paper
LatPlan and Cube-Space AutoEncoder
Pros:
Discussion:
I-ROSAME [Xi et al., ‘24]
Learning from Low-Level Control
[Figure: example trajectory over A, B, C with the actions Move(A, B), Load(Pkg, C), and Move(B, C)]
Raw set of features
(e.g., sensor data)
Low-level controllers
(AKA Skills, Options, VLA?…)
Learning from Low-Level Control
[Figure: the actions Move(A, B), Load(Pkg, C), and Move(B, C)]
Raw set of features
(e.g., sensor data)
Low-level controllers
(AKA Skills, Options, VLA?…)
Learning from Low-Level Skills [Konidaris et al. ’18]
Many details in the paper
….
🡺preconditions
🡺effects(*)
Learning from Low-Level Skills [Konidaris et al. ’18]
From skills to symbols: Learning symbolic representations for abstract high-level planning. G. Konidaris et al. JAIR 2018
Predicate Invention for Bi-Level Planning [Silver et al. ’23]
VisualPredicator [Liang et al., 2025]
VisualPredicator [Liang et al., 2025]
Strategy #1 (Discrimination)
“Explain why action a worked in state s and failed in s’”
Strategy #2 (Transition Modeling)
“Explain what has changed after doing action a in state s”
Strategy #3 (Unconditional Generation)
“Suggest useful compositions of existing predicates”
Learning From Text
Learning From Text
[Figure: example trajectory over A, B, C with the actions Move(A, B), Load(Pkg, C), and Move(B, C)]
“He drove the truck from A to B”
“Then he drove from B to C”
“Finally, he picked up the package”
Framer (Lindsay et al. ‘17)
Ancient Pre-Historical Era
(i.e., pre-LLM)
Learning From Text [Lindsay et al. ’17]
LOCM
Creating Planning Domain Models with LLMs
ACL 2025
LLMs as Planning Formalizers
On the Limits of Language Models as Planning Formalizers (Huang & Zhang, ACL ’25)
Prior work suggests that using LLMs as formalizers is the more robust approach
(but results are somewhat inconclusive)
How Natural is the Natural Language?
On the Limits of Language Models as Planning Formalizers (Huang & Zhang, ACL ’25)
LLMs as Planning Formalizers
….
LLMs as Planning Formalizers: A Survey for Leveraging Large Language Models to Construct Automated Planning Models (Tantakoun et al., ACL ‘25)
Types of Input for Representation Learning
[Figure: example trajectory over A, B, C with the actions Move(A, B), Load(Pkg, C), and Move(B, C)]
Images
Text
Raw set of features
(e.g., sensor data)
Basic set of features
(e.g., goal & init description)
LOCM
SIFT
FRAMER
LLM-based
VisualPredicator
Skills2Symbols
LatPlan
Program Synthesis
Summary
[Figure: example trajectory over A, B, C with the actions Move(A, B), Load(Pkg, C), and Move(B, C)]
Images
Text
Raw set of features
(e.g., sensor data)
Basic set of features
(e.g., goal & init description)
LOCM
SIFT
VisualPredicator
Skills2Symbols
LatPlan
Program Synthesis
I-ROSAME
LLM-based
What’s the Plan for Today?
Next: Active Learning and Open Challenges
Time | Session | Speaker |
08:30–09:15 | Introduction & Domain Learning Basics | Roni Stern |
09:15–10:00 | Offline Learning Action Models | Argaman Mordoch |
10:00–10:30 | Coffee Break ☕ | |
10:30–11:00 | Learning State Abstraction | Roni Stern |
11:00–11:45 | Active Learning and Open Challenges | Argaman Mordoch & Roni Stern |
11:45–12:30 | Hands-on Session | Leonardo Lamanna |
Argaman Mordoch
BGU