Domain Model Learning in AI Planning
Tutorial at AAAI 2026
Part 2: Learning State Abstractions
Roni Stern
Ben Gurion University of the Negev
Some slides from: Christian Muise, Masataro Asai
Representation Learning
Learning a Symbolic Representation of the World
In general: hard task, ill-defined, deeply studied (ICLR?)
In particular: a symbolic representation for planning
Still hard, still ill-defined ☺
Learning Planning Domain Models
[Figure labels: Planning Problem (×3); Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …); Operator; Observation (×4); Planning Domain Learner]
[Figure: Planning Problems and Operator/Observation inputs feed the Planning Domain Learner, which outputs domain.pddl, the model of the environment]
domain.pddl
Model of the environment
(:predicates (on ?x ?y)
(smaller ?x ?y)
(clear ?x))
(:action move
:parameters (?x ?y ?z))
What is the input?
Learning From Action Sequences
[Figure: example action sequence over locations A, B, C: Move(A, B), Move(B, C), Load(Pkg, C)]
Move(A, B) modifies some predicates of A or B
Move(B, C) modifies some predicates of B or C
Insight: it is the same change in both cases!
(Actually, this is an assumption)
This is the “object-centric” approach
Types of Input for Representation Learning
Images
Text
Raw set of features
(e.g., sensor data)
Basic set of symbols
(e.g., goal description)
Low-level controllers
Learning From Action Sequences
Approach: “object-centric” learning
LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]
Assumptions
- The changes in an object's state form a finite-state automaton (FSA)
- If Do(x) changes the state of x to “done” (done(x)), then Do(y) changes the state of y to “done” (done(y))
[Figure: FSA with two states, Not Done and Done; Do(x) leads from Not Done to Done, Undo(x) leads back]
Intuition: each FSA state will be a predicate(x)
LOCM example [Cresswell and Gregory ’11, Cresswell et al. ’13, …]
Actions on object o: open(o), fetch-jack(_,o), fetch-wrench(_,o), close(o)
Initial states: S1, S2, S3, S4, S5, S6, S7, S8
Step 1: merge into {S2,S3}, {S4,S5}, {S6,S7}; S1 and S8 stay separate
Step 2: further evidence links S5 with {S2,S3}, S7 with {S4,S5}, and S3 with {S6,S7}
Step 3: merge into {S2,S3,S4,S5,S6,S7}; S1 and S8 stay separate
Step 4: merge S1 and S8 into {S1,S8}
Result: a two-state FSA over {S1,S8} and {S2,S3,S4,S5,S6,S7}, with transitions open(o), close(o), fetch-jack(_,o), fetch-wrench(_,o)
The two states become predicates P1(o) and P2(o): closed(o)? opened(o)?
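The merging steps in the example can be sketched in a few lines of Python. This is a minimal illustration of the state-unification idea only, not the full LOCM algorithm. The two traces, the function name `merge_states`, and the convention that the i-th action goes from S(2i-1) to S(2i) are assumptions made for this sketch.

```python
def merge_states(actions, traces):
    """Unify per-action start/end states for a single object.

    Each action gets its own start and end state. Whenever action b
    directly follows action a on the same object, the end state of a
    and the start state of b must be the same state, so they are merged.
    """
    # Assumed numbering: the i-th action (1-based) goes from S(2i-1) to S(2i).
    start = {a: f"S{2 * i + 1}" for i, a in enumerate(actions)}
    end = {a: f"S{2 * i + 2}" for i, a in enumerate(actions)}

    # Union-find over state names.
    parent = {s: s for s in list(start.values()) + list(end.values())}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            parent[find(end[a])] = find(start[b])

    # Collect the merged groups in a deterministic order.
    groups = {}
    for s in parent:
        groups.setdefault(find(s), set()).add(s)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical traces of actions applied to the same object o.
actions = ["open", "fetch-jack", "fetch-wrench", "close"]
traces = [
    ["open", "fetch-jack", "fetch-wrench", "close"],
    ["open", "fetch-wrench", "fetch-jack", "close", "open"],
]
states = merge_states(actions, traces)
print(states)
```

With these two traces the eight initial states collapse into two groups, matching the final two-state FSA of the example.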
Additional details and extensions
SIFT [Gosgens et al. ’25]
Assumption: domain is a well-formed STRIPS domain
If opened(door) is an effect of op(door),
then not(opened(door)) is a precondition
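The well-formedness assumption can be illustrated with a small check. This is a minimal sketch, not part of SIFT: the `Action` class and the `is_well_formed` helper are hypothetical names, and the symmetric condition on delete effects is an assumption of this sketch rather than something stated on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    pos_pre: frozenset = frozenset()  # predicates that must be true
    neg_pre: frozenset = frozenset()  # predicates that must be false
    add: frozenset = frozenset()      # predicates made true
    delete: frozenset = frozenset()   # predicates made false


def is_well_formed(action):
    # Every added predicate must be required to be false beforehand.
    adds_ok = action.add <= action.neg_pre
    # Symmetric condition for deletes (an assumption of this sketch).
    dels_ok = action.delete <= action.pos_pre
    return adds_ok and dels_ok


# op(door) adds opened(door) and requires not(opened(door)).
open_door = Action(
    "open",
    neg_pre=frozenset({"opened(door)"}),
    add=frozenset({"opened(door)"}),
)
# Same effect, but without the negative precondition.
sloppy_open = Action("open", add=frozenset({"opened(door)"}))

print(is_well_formed(open_door), is_well_formed(sloppy_open))
```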
SIFT pseudo-code
While not done
An action pattern represents a predicate that is affected by the action
A feature represents a predicate and all the actions affecting it
Example: P(?t1)
2. Generate constraints based on traces
Is the resulting constraint set satisfiable? Yes in one case, UNSAT! in another
Note: the SAT check here is efficient (2-SAT)
Problem: too many possible “features”
Solution(*): infer object types!
Grouping action parameters to infer types
Example, before grouping:
open(?x1)
fetch-jack(?x2 ?x3)
fetch-wrench(?x4 ?x5)
close(?x6)
Candidate types: t1? for ?x1, ?x3, ?x5, ?x6; t2? for ?x2; t3? for ?x4
After grouping:
open(?t1)
fetch-jack(?t2 ?t1)
fetch-wrench(?t3 ?t1)
close(?t1)
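One way to picture the grouping step: two parameter slots get the same type whenever the same object appears in both. The sketch below works under that assumption only. The trace, the object names, and `infer_types` are hypothetical, and the actual SIFT procedure may differ in its details.

```python
def infer_types(trace):
    """Group (action, position) slots that are ever filled by the same object."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    first_slot_of = {}  # object -> first slot it was seen in
    for name, args in trace:
        for pos, obj in enumerate(args):
            slot = (name, pos)
            find(slot)  # register the slot
            if obj in first_slot_of:
                union(slot, first_slot_of[obj])
            else:
                first_slot_of[obj] = slot

    # Name the groups t1, t2, ... in order of first appearance.
    type_of, names = {}, {}
    for slot in parent:
        root = find(slot)
        names.setdefault(root, f"t{len(names) + 1}")
        type_of[slot] = names[root]
    return type_of


# Hypothetical grounded trace for the example actions.
trace = [
    ("open", ["boot"]),
    ("fetch-jack", ["jack", "boot"]),
    ("fetch-wrench", ["wrench", "boot"]),
    ("close", ["boot"]),
]
types = infer_types(trace)
for slot, t in types.items():
    print(slot, t)
```

On this trace the slots holding the shared object all land in one group, while the first parameters of fetch-jack and fetch-wrench each keep a group of their own.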
Many additional details in the paper
…
SIFT
Pros:
Discussion:
Learning From Text [Lindsay et al. ’17]
[Figure: the example action sequence Move(A, B), Move(B, C), Load(Pkg, C), paired with the sentences below]
“He drove the truck from A to B”
“Then he drove from B to C”
“Finally, he picked up the package”
Framer (Lindsay et al. ‘17)
[Figures not recovered from four Framer slides; one of them is labeled LOCM]
No LLM?!?!
Creating Planning Domain Models with LLMs
(Oates et al. ‘24, Zhang et al. ‘24)
ACL 2025
Learning From Images
[Figure: the example action sequence Move(A, B), Move(B, C), Load(Pkg, C), observed as images]
LatPlan [Asai and Fukunaga ’18]
What would be a good symbolic state for an image?
Solution: the State AutoEncoder (SAE)
Encoder and decoder: DNNs
Latent state representation = the symbolic state
Use Gumbel-Softmax so that the latent representation is categorical (Booleans)
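For readers unfamiliar with the trick, here is a minimal pure-Python sketch of Gumbel-Softmax sampling for a single Boolean latent variable. It only illustrates the sampling step and is not LatPlan's implementation. The function name `gumbel_softmax`, the logits, and the temperature values are assumptions chosen for the illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng=random):
    """Return a relaxed one-hot sample over the categories in `logits`."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)), with u ~ Uniform(0, 1).
        u = max(rng.random(), 1e-12)  # guard against u == 0.0
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax.
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
# High temperature: a smooth distribution over {true, false}.
soft = gumbel_softmax([1.0, -1.0], temperature=5.0, rng=rng)
# Very low temperature: the sample is close to a one-hot (Boolean) vector.
hard = gumbel_softmax([1.0, -1.0], temperature=1e-6, rng=rng)
print(soft, hard)
```

Lowering the temperature during training pushes the relaxed samples toward discrete values while keeping the operation differentiable, which is why the trick suits a latent layer that should end up Boolean.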
What would be a good symbolic action?
Solution: the Action AutoEncoder (AAE)
Encoder and decoder: DNNs
Latent action representation = the symbolic action
Use Gumbel-Softmax so that the latent representation is categorical (Booleans)
What would be a good symbolic representation?
[Figure: the overall LatPlan architecture, composed of several DNNs]
Cube-Space AutoEncoder [Asai and Muise ’20]
What would be a good symbolic representation?
Encodes preconditions and effects “directly”
Many details in the paper
LatPlan and Cube-Space AutoEncoder
Pros:
Discussion:
I-ROSAME [Xi et al., ‘24]
Learning from Low-Level Control
[Figure: the example action sequence Move(A, B), Move(B, C), Load(Pkg, C), executed by low-level controllers]
Raw set of features
(e.g., sensor data)
Low-level controllers
(AKA Skills, Options, VLA?…)
Learning from Low-Level Skills [Konidaris et al. ’18]
Many details in the paper
….
→ preconditions
→ effects(*)
From skills to symbols: Learning symbolic representations for abstract high-level planning. G. Konidaris et al. JAIR 2018
Predicate Invention for Bi-Level Planning [Silver et al. ’23]
VisualPredicator [Liang et al., 2025]
Strategy #1 (Discrimination)
“Explain why action a worked in state s and failed in s’”
Strategy #2 (Transition Modeling)
“Explain what has changed after doing action a in state s”
Strategy #3 (Unconditional Generation)
“Suggest useful compositions of existing predicates”
Types of Input for Representation Learning
Images
Text
Raw set of features
(e.g., sensor data)
Basic set of features
(e.g., goal & init description)
LOCM
SIFT
FRAMER
LLM-based
VisualPredicator
Skills2Symbols
LatPlan
Program Synthesis
What’s Next?
Given: symbolic states, actions, and traces
Not given: actions’ preconditions and effects
Time | Session | Speaker |
08:30–09:15 | Introduction & Domain Learning Basics | Roni Stern |
09:15–09:45 | Learning State Abstractions | Roni Stern |
09:45–10:30 | Offline Learning Domain Models | Leonardo Lamanna |
10:30–11:00 | Coffee Break | |
11:00–11:45 | Hands-on Session | Leonardo Lamanna |
11:45–12:30 | Online Learning and Open Challenges | Roni Stern |