1 of 65

Rate This Tutorial @ domain-learning.github.io

2 of 65

Domain Model Learning in AI Planning Tutorial

AAMAS 2026

3 of 65

What’s the Plan for Today?

Next: Learning State Abstraction for Planning

Time	Session	Speaker
08:30–09:15	Introduction & Domain Learning Basics	Roni Stern
09:15–10:00	Offline Learning Action Models	Argaman Mordoch
10:00–10:30	Coffee Break ☕
10:30–11:00	Learning State Abstraction	Roni Stern
11:00–11:45	Active Learning and Open Challenges	Argaman Mordoch & Roni Stern
11:45–12:30	Hands-on Session	Leonardo Lamanna

4 of 65

Part 3: Learning State Abstractions

Roni Stern

Ben-Gurion University of the Negev

Some slides from: Christian Muise, Masataro Asai

5 of 65

Representation Learning

Learning a Symbolic Representation of the World

In general: hard task, ill-defined, deeply studied (ICLR?)

In particular: a symbolic representation for planning

Still hard, still ill-defined ☺

6 of 65

Learning Planning Domain Models

[Diagram: planning problems, operators, and observations; a Planning Domain Learner; a formal domain + problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …)]

7 of 65

Learning Planning Domain Models

domain.pddl

How to represent a state?
What operators do we have?
How do they work?

Model of the environment

8 of 65

Learning Planning Domain Models

(:predicates (on ?x ?y)
             (smaller ?x ?y)
             (clear ?x))
(:action move
  :parameters (?x ?y ?z))

9 of 65

Learning Planning Domain Models

What is the input?

10 of 65

Types of Input for Representation Learning

[Figure: a sequence of states over A, B, and C, connected by the actions Move(A, B), Load(Pkg, C), and Move(B, C)]

Input types:
Images
Text
Raw set of features (e.g., sensor data)
Basic set of features (e.g., goal & init description)

Methods: LOCM, SIFT, LLM-based, VisualPredicator, Skills2Symbols, LatPlan, Program Synthesis, I-ROSAME

11 of 65

Learning From Action Sequences

[Figure: a sequence of states over A, B, and C, connected by the actions Move(A, B), Load(Pkg, C), and Move(B, C)]

Move(A, B) changes some predicates over A or B

Move(B, C) changes some predicates over B or C

Insight: it’s the same change!

Actually, this is an assumption

The “Object-Centric” approach

Learn how actions affect their parameters
Suggest sufficient predicates to encode these effects

12 of 65

Types of Input for Representation Learning

[Figure: a sequence of states over A, B, and C, connected by the actions Move(A, B), Load(Pkg, C), and Move(B, C), shown twice]

Input types:
Images
Text
Raw set of features (e.g., sensor data)
Basic set of symbols (e.g., goal description)
Low-level controllers

13 of 65

Learning From Action Sequences

Approach: “Object-Centric”

Learn how actions affect their parameters
Suggest sufficient “features” to fit the assumptions

14 of 65

LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]

Assumptions

Objects have a “state”
Actions affect the “states” of their parameters
Example: the action Do(x) changes the state of x to “Done” (Done(x))

Changes in the state of an object form a finite state automaton (FSA)

Actions with the same name transition similarly
Example: if Do(x) changes the state of x to “Done” (Done(x)), then Do(y) changes the state of y to “Done” (Done(y))

[Figure: two-state FSA for x with states “Not Done” and “Done”; Do(x) leads to “Done”, Undo(x) leads back to “Not Done”]

Intuition: an FSA state will be a predicate over x

15 of 65

LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]

Example

(open c1)
(fetch-jack j1 c1)
(fetch-wrench wr1 c1)
(close c1)
(open c2)
(fetch-wrench wr2 c2)
(fetch-jack j2 c2)
(close c2)
(close c3)
(open c3)

Transitions on the container argument o, each with its own start and end state:
open(o): S1 → S2
fetch-jack(_,o): S3 → S4
fetch-wrench(_,o): S5 → S6
close(o): S7 → S8

16 of 65

LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]

State classes after merging: {S2, S3}, {S4, S5}, {S6, S7}, {S1}, {S8}

18 of 65

LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]

Further merges: S5 joins {S2, S3}; S7 joins {S4, S5}; S3 joins {S6, S7}. Unchanged: {S1}, {S8}.

19 of 65

LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]

State classes after merging: {S2, S3, S4, S5, S6, S7}, {S1}, {S8}

21 of 65

LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]

State classes after merging: {S2, S3, S4, S5, S6, S7}, {S1, S8}

22 of 65

LOCM [Cresswell and Gregory ’11, Cresswell et al. ’13, …]

Final state classes: {S2, S3, S4, S5, S6, S7} and {S1, S8}

[Figure: the resulting two-state FSA for o. open(o) and close(o) move o between the two states; fetch-jack(_,o) and fetch-wrench(_,o) loop on {S2, …, S7}.]

Each state becomes a predicate over o: P1(o) and P2(o)

Possible readings: closed(o)? opened(o)?

Additional details and extensions

Zero-parameter predicates (e.g., hand-empty)
Detecting predicates with multiple parameters (LOCM2)
Learning action costs (N-LOCM)
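
The merging walked through on the previous slides can be sketched in a few lines of Python. This is only a minimal illustration of the state-merging idea, not the LOCM implementation: it assumes one transition per (action name, argument index) pair, skips sort detection, and the names `locm_state_classes`, `trace`, and `classes` are made up for this sketch.

```python
from collections import defaultdict


def locm_state_classes(trace):
    """Merge transition start/end states in the spirit of LOCM's first step.

    Illustrative assumption: a transition is an (action name, argument index)
    pair, and every transition starts with its own start and end state.
    """
    parent = {}

    def find(s):
        parent.setdefault(s, s)
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    def union(a, b):
        parent[find(a)] = find(b)

    # Sequence of transitions each object goes through, in trace order.
    per_object = defaultdict(list)
    for name, *args in trace:
        for pos, obj in enumerate(args):
            per_object[obj].append((name, pos))

    for transitions in per_object.values():
        for t in transitions:
            find((t, "start"))
            find((t, "end"))
        # The end state of one transition is the start state of the next.
        for prev, nxt in zip(transitions, transitions[1:]):
            union((prev, "end"), (nxt, "start"))

    classes = defaultdict(set)
    for s in list(parent):
        classes[find(s)].add(s)
    return list(classes.values())


trace = [
    ("open", "c1"), ("fetch-jack", "j1", "c1"),
    ("fetch-wrench", "wr1", "c1"), ("close", "c1"),
    ("open", "c2"), ("fetch-wrench", "wr2", "c2"),
    ("fetch-jack", "j2", "c2"), ("close", "c2"),
    ("close", "c3"), ("open", "c3"),
]
classes = locm_state_classes(trace)
for c in classes:
    print(sorted(c))
```

On this trace the container argument ends up with two state classes: the start of open together with the end of close (S1, S8 on the slides), and the six remaining states (S2 to S7). The jack and wrench arguments keep their own separate states.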

23 of 65

SIFT [Gosgens et al. ’25]

Assumption: domain is a well-formed STRIPS domain

If Opened(Door) is an effect of Open(Door),

then not(Opened(Door)) is a precondition of Open(Door)

SIFT pseudo-code

While not done:

1. Suggest possible predicates and affecting actions
2. Generate constraints based on traces
3. If a satisfying action model exists, done!

24 of 65

SIFT [Gosgens et al. ’25]

1. Suggest possible predicates and affecting actions

An action pattern represents a predicate that is affected by the action

A feature represents a predicate and all the actions affecting it

25 of 65

SIFT [Gosgens et al. ’25]

Example

(open c1)
(fetch-jack j1 c1)
(fetch-wrench wr1 c1)
(close c1)
(open c2)
(fetch-wrench wr2 c2)
(fetch-jack j2 c2)
(close c2)

P(?t1)

2. Generate constraints based on traces

27 of 65

SIFT [Gosgens et al. ’25]

2. Generate constraints based on traces

Candidate predicate: P(?t1)

Is it satisfiable? Yes

28 of 65

SIFT [Gosgens et al. ’25]

2. Generate constraints based on traces

Candidate predicate: P(?t1)

UNSAT!

Note: SAT check here is efficient (2-SAT)
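
To make the constraint step concrete, here is a heavily simplified sketch, assuming a unary candidate predicate P and action patterns written as (action name, argument index) pairs. It is not the SIFT procedure from the paper, which is more general; the function name `consistent_signs` and the pattern encoding are assumptions of this sketch. Under the well-formed STRIPS assumption, two consecutive actions that affect P on the same object must have opposite effects (one adds, one deletes). Each such "must differ" constraint is a pair of two-literal clauses, so the check reduces to 2-colouring a graph.

```python
from collections import defaultdict, deque


def consistent_signs(trace, patterns):
    """Return an add/del sign per pattern, or None if the constraints are UNSAT."""
    # Sequence of affecting patterns per object, in trace order.
    per_object = defaultdict(list)
    for name, *args in trace:
        for pos, obj in enumerate(args):
            if (name, pos) in patterns:
                per_object[obj].append((name, pos))

    # Consecutive affecting actions on one object must have opposite signs.
    differs = defaultdict(set)
    for seq in per_object.values():
        for a, b in zip(seq, seq[1:]):
            differs[a].add(b)
            differs[b].add(a)

    sign = {}
    for start in patterns:
        if start in sign:
            continue
        # The initial truth value of P is unknown, so this choice is arbitrary.
        sign[start] = "add"
        queue = deque([start])
        while queue:
            a = queue.popleft()
            wanted = "del" if sign[a] == "add" else "add"
            for b in differs[a]:
                if b not in sign:
                    sign[b] = wanted
                    queue.append(b)
                elif sign[b] != wanted:
                    return None  # odd cycle: no consistent assignment
    return sign


trace = [
    ("open", "c1"), ("fetch-jack", "j1", "c1"),
    ("fetch-wrench", "wr1", "c1"), ("close", "c1"),
    ("open", "c2"), ("fetch-wrench", "wr2", "c2"),
    ("fetch-jack", "j2", "c2"), ("close", "c2"),
]
ok = consistent_signs(trace, {("open", 0), ("close", 0)})
bad = consistent_signs(trace, {("open", 0), ("fetch-jack", 1), ("fetch-wrench", 1)})
print(ok is not None, bad is None)
```

The first candidate is satisfiable because open and close alternate on every container. The second is not: the two fetch actions occur in both orders after open, so all three patterns would have to differ pairwise.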

29 of 65

SIFT [Gosgens et al. ’25]

Problem: too many possible “features”

Solution(*): infer object types!

30 of 65

SIFT [Gosgens et al. ’25]

Grouping action parameters to infer types

Example

(open c1)
(fetch-jack j1 c1)
(fetch-wrench wr1 c1)
(close c1)
(open c2)
(fetch-wrench wr2 c2)
(fetch-jack j2 c2)
(close c2)

open(?x1)

fetch-jack(?x2 ?x3)

fetch-wrench(?x4 ?x5)

close(?x6)

t1?

t2?

t3?

31 of 65

SIFT [Gosgens et al. ’25]

Grouping action parameters to infer types

open(?t1)

fetch-jack(?t2 ?t1)

fetch-wrench(?t3 ?t1)

close(?t1)
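
The grouping above can be reproduced with a small union-find over parameter slots. This is a sketch under a simplifying assumption (two slots get the same type whenever some object fills both), not necessarily how the paper does it; `infer_types` and the (action name, argument index) slot encoding are illustrative names.

```python
from collections import defaultdict


def infer_types(trace):
    """Group action parameter slots that are filled by a common object."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    first_slot_of = {}  # object -> first slot it was seen in
    for name, *args in trace:
        for pos, obj in enumerate(args):
            slot = (name, pos)
            find(slot)  # register the slot
            if obj in first_slot_of:
                union(slot, first_slot_of[obj])
            else:
                first_slot_of[obj] = slot

    groups = defaultdict(set)
    for slot in list(parent):
        groups[find(slot)].add(slot)
    return list(groups.values())


trace = [
    ("open", "c1"), ("fetch-jack", "j1", "c1"),
    ("fetch-wrench", "wr1", "c1"), ("close", "c1"),
    ("open", "c2"), ("fetch-wrench", "wr2", "c2"),
    ("fetch-jack", "j2", "c2"), ("close", "c2"),
]
types = infer_types(trace)
for group in types:
    print(sorted(group))
```

On the example trace this yields three groups, matching t1 (the container slots of all four actions), t2 (the first slot of fetch-jack), and t3 (the first slot of fetch-wrench).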

32 of 65

SIFT [Gosgens et al. ’25]

Many additional details in the paper

Handling additional information on traces (extended traces)
Static predicates
Completeness theorems

…

33 of 65

SIFT [Gosgens et al. ’25]

Pros:

Learn only from action traces!
No supervision required
Highly scalable (?)
Works reasonably on benchmarks

Discussion:

Is the well-formed STRIPS assumption reasonable?
Losing explainability?
Is not knowing anything about the states practical?

34 of 65

Learning From Images

What would be a good symbolic state?

What would be a good symbolic action?

35 of 65

LatPlan [Asai and Fukunaga ’18]

What would be a good symbolic state for an image?

Propositional (i.e., Boolean vector)
Encodes relevant information for planning

Solution: the State AutoEncoder (SAE)

36 of 65

LatPlan [Asai and Fukunaga ’18]

Solution: the State AutoEncoder (SAE)

[Figure: autoencoder built from DNNs; its latent state representation is the symbolic state]

Use Gumbel-Softmax so the latent representation is propositional (Booleans)
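
For readers who have not met Gumbel-Softmax, the following standard-library sketch shows the relaxation on a single binary latent variable. It is not LatPlan's code: the function names, the logits, and the temperatures are illustrative assumptions, and a real SAE would apply this inside a neural network with automatic differentiation.

```python
import math
import random


def gumbel_noise(n, rng):
    """Draw n Gumbel(0, 1) samples; the clamp avoids log(0)."""
    return [-math.log(-math.log(max(rng.random(), 1e-12))) for _ in range(n)]


def gumbel_softmax(logits, tau, noise):
    """Softmax of (logits + noise) / tau, computed in a numerically stable way."""
    scaled = [(l + g) / tau for l, g in zip(logits, noise)]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
logits = [2.0, 0.0]  # one latent bit, as scores for [true, false]

# High temperature: a soft, differentiable mixture of the two values.
soft = gumbel_softmax(logits, tau=5.0, noise=gumbel_noise(2, rng))

# Low temperature (noise set to zero here to keep the example deterministic):
# the output is close to one-hot, i.e. close to a Boolean.
sharp = gumbel_softmax(logits, tau=0.1, noise=[0.0, 0.0])
print(soft, sharp)
```

Lowering the temperature during training pushes each latent variable toward a one-hot (Boolean) value while keeping the sampling step differentiable.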

37 of 65

LatPlan [Asai and Fukunaga ’18]

What would be a good symbolic action?

Encodes the change in a pair of consecutive images
Allows predicting the next state and applicability

Solution: the Action AutoEncoder (AAE)

[Figure: autoencoder built from DNNs; its latent action representation is the symbolic action]

Use Gumbel-Softmax so the latent representation is categorical (Booleans)

38 of 65

LatPlan [Asai and Fukunaga ’18]

What would be a good symbolic representation?

Propositional (i.e., Boolean vector)
Encodes relevant information for planning
Allows predicting the next state and applicability

DNN

39 of 65

LatPlan [Asai and Fukunaga ’18]

What would be a good symbolic representation?

Propositional (i.e., Boolean vector)
Encodes relevant information for planning
Allows predicting the next state and applicability
Has propositional preconditions and effects
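
As a rough illustration of what propositional preconditions and effects over a Boolean latent state look like, the sketch below reads them off observed (before, after) pairs for a single action label. This is a conservative, hand-rolled hypothesis and not the action-model learner used in LatPlan; `summarize_action` and the example bit vectors are assumptions made for this example.

```python
def summarize_action(transitions):
    """Summarize one action from (before, after) pairs of Boolean tuples.

    Bits that are constant across all observed pre-states are reported as
    candidate preconditions; bits that flip are reported as effects.
    """
    n = len(transitions[0][0])
    bits = range(n)
    return {
        "pre_pos": [i for i in bits if all(b[i] for b, _ in transitions)],
        "pre_neg": [i for i in bits if all(not b[i] for b, _ in transitions)],
        "add": [i for i in bits if any(not b[i] and a[i] for b, a in transitions)],
        "del": [i for i in bits if any(b[i] and not a[i] for b, a in transitions)],
    }


# Two observed transitions of the same (hypothetical) latent action.
observed = [
    ((True, False, False), (False, True, False)),
    ((True, False, True), (False, True, True)),
]
model = summarize_action(observed)
print(model)
```

Here bit 0 must hold and bit 1 must not hold before the action, the action adds bit 1 and deletes bit 0, and bit 2 is irrelevant because it varies across pre-states and never changes.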

40 of 65

Cube-Space AutoEncoder [Asai and Muise ’20]

Encodes preconditions and effects “directly”

Many details in the paper

41 of 65

LatPlan and Cube-Space AutoEncoder

Pros:

Learn only from images!
No supervision required
Works reasonably on benchmarks

Discussion:

Losing formal guarantees?
Losing explainability?
Is it practical to not know your agent’s actions?
Is it practical to know every frame?

42 of 65

I-ROSAME [Xi et al., ‘24]

Input: symbolic representation, image & action traces
Output: action model + CV model

43 of 65

Learning from Low-Level Control

[Figure: a sequence of states over A, B, and C, connected by the actions Move(A, B), Load(Pkg, C), and Move(B, C)]

Raw set of features (e.g., sensor data)

Low-level controllers (AKA Skills, Options, VLA?…)

45 of 65

Learning from Low-Level Control

[Figure: the action sequence Move(A, B), Load(Pkg, C), Move(B, C)]

Raw set of features (e.g., sensor data)

Low-level controllers (AKA Skills, Options, VLA?…)

46 of 65

Learning from Low-Level Skills [Konidaris et al. ’18]

Many details in the paper

Identify predicates based on affected state variables
Learning for probabilistic planning domains

….

🡺preconditions

🡺effects(*)

47 of 65

Learning from Low-Level Skills [Konidaris et al. ’18]

From skills to symbols: Learning symbolic representations for abstract high-level planning. G. Konidaris et al. JAIR 2018

49 of 65

Predicate Invention for Bi-Level Planning [Silver et al. ’23]

Input: plan traces, and goal predicate symbols
Approach: predicate invention as program synthesis

Synthesize symbols, starting from goal predicates
Learn symbolic actions, try to plan for the given traces, optimize

How to guide the program synthesis process?

Optimize for plans similar in cost to the given traces
Optimize for plans that were easier to find (A* search nodes)

50 of 65

VisualPredicator [Liang et al., 2025]

Input: plan traces+images, goal predicate symbols
Approach: program synthesis + a Vision Language Model

Synthesize symbols, also using a VLM
Learn symbolic actions, try to plan for the given traces, optimize

51 of 65

VisualPredicator [Liang et al., 2025]

Strategy #1 (Discrimination)

“Explain why action a worked in state s and failed in s’”

Strategy #2 (Transition Modeling)

“Explain what has changed after doing action a in state s”

Strategy #3 (Unconditional Generation)

“Suggest useful compositions of existing predicates”

52 of 65

Learning From Text

53 of 65

Learning From Text

[Figure: a sequence of states over A, B, and C, connected by the actions Move(A, B), Load(Pkg, C), and Move(B, C)]

“He drove the truck from A to B”

“Then he drove from B to C”

“Finally, he picked up the package”

Framer (Lindsay et al. ‘17)

Parse sentences to “action templates”
Cluster sentences by similarities
Run LOCM considering each cluster as an action

Ancient Pre-Historical Era

(i.e., pre-LLM)

54 of 65

Learning From Text [Lindsay et al. ’17]

58 of 65

Creating Planning Domain Models with LLMs

ACL 2025

59 of 65

LLMs as Planning Formalizers

On the Limits of Language Models as Planning Formalizers (Huang & Zhang, ACL ’25)

Prior work suggests that using LLMs as formalizers is a more robust approach than using them directly as planners

(but results are somewhat inconclusive)

60 of 65

How Natural is the Natural Language?

On the Limits of Language Models as Planning Formalizers (Huang & Zhang, ACL ’25)

61 of 65

LLMs as Planning Formalizers

Zero-shot approaches (Oates et al. ‘24, Zhang et al. ‘24)
Generate multiple candidates and merge (Huang et al., 2024)
Generate-Test-Revise (Kambhampati et al., 2024)

….
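
The generate-test-revise pattern in the list above can be sketched as a plain loop. Everything here is a hypothetical stand-in: `stub_generate` plays the role of an LLM call, `stub_validate` plays the role of a PDDL validator or planner, and neither reflects the interface of any cited system.

```python
from typing import Callable, List, Optional


def generate_test_revise(
    generate: Callable[[str, List[str]], str],
    validate: Callable[[str], List[str]],
    description: str,
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask for a candidate model, test it, and feed the errors back."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        candidate = generate(description, feedback)
        feedback = validate(candidate)
        if not feedback:
            return candidate  # the candidate passed every check
    return None  # no valid candidate within the round budget


def stub_generate(description: str, feedback: List[str]) -> str:
    # Hypothetical stand-in for an LLM: repairs its output once it sees feedback.
    if feedback:
        return ("(:action move :parameters (?x ?y) "
                ":precondition (clear ?x) :effect (on ?x ?y))")
    return "(:action move :parameters (?x ?y) :precondition (clear ?x))"


def stub_validate(pddl: str) -> List[str]:
    # Hypothetical stand-in for a validator: two shallow syntactic checks.
    errors = []
    if pddl.count("(") != pddl.count(")"):
        errors.append("unbalanced parentheses")
    if ":effect" not in pddl:
        errors.append("action has no :effect")
    return errors


result = generate_test_revise(stub_generate, stub_validate, "move x onto y")
print(result)
```

In this toy run the first candidate is rejected for lacking an effect and the second is accepted. A real pipeline would replace the stubs with a model call and with stronger tests, for example parsing the domain or trying to solve known problems with it.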

LLMs as Planning Formalizers: A Survey for Leveraging Large Language Models to Construct Automated Planning Models (Tantakoun et al., ACL ‘25)

62 of 65

Types of Input for Representation Learning

[Figure: a sequence of states over A, B, and C, connected by the actions Move(A, B), Load(Pkg, C), and Move(B, C), shown twice]

Input types:
Images
Text
Raw set of features (e.g., sensor data)
Basic set of features (e.g., goal & init description)

Methods: LOCM, SIFT, FRAMER, LLM-based, VisualPredicator, Skills2Symbols, LatPlan, Program Synthesis

63 of 65

Summary

[Figure: a sequence of states over A, B, and C, connected by the actions Move(A, B), Load(Pkg, C), and Move(B, C)]

Input types:
Images
Text
Raw set of features (e.g., sensor data)
Basic set of features (e.g., goal & init description)

Methods: LOCM, SIFT, VisualPredicator, Skills2Symbols, LatPlan, Program Synthesis, I-ROSAME, LLM-based

64 of 65

What’s the Plan for Today?

Next: Active Learning and Open Challenges

Argaman Mordoch

BGU
