1 of 42


Domain Model Learning for AI Planning

Tutorial at AAAI 2026

2 of 42

What’s the Plan for Today?

Time        | Session                               | Speaker
08:30–09:15 | Introduction & Domain Learning Basics | Roni Stern
09:15–09:45 | Learning State Abstractions           | Roni Stern
09:45–10:30 | Offline Learning Domain Models        | Leonardo Lamanna
10:30–11:00 | Coffee Break                          |
11:00–11:45 | Hands-on Session                      | Leonardo Lamanna
11:45–12:30 | Online Learning and Open Challenges   | Roni Stern

3 of 42

What’s the Plan for Today?


1. What is planning?
   • Symbolic
   • Domain-independent
2. Why learn a domain for planning?
3. The domain model learning problem

4 of 42

Domain Model Learning in AI Planning
Part 1: Introduction & Basics

Roni Stern

Ben-Gurion University of the Negev

5 of 42

Background: Planning

What is Planning?

[Diagram: planning as search. From the Initial State S, operators generate successor states (A, B, C, D, E); each is tested with "Is goal?" until the answer is "Yes!"; the operator sequence leading there is a plan to the Goal.]

6 of 42

Background: Planning

Why Planning?


7 of 42

Why Plan if you can Chat?


8 of 42

Why Planning?

9 of 42

Why Planning?

10 of 42

Why Planning?

11 of 42

Why Planning?

12 of 42

Background: Planning

Why is Planning Hard?


Key challenge in planning: Combinatorial search
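To make the combinatorial point concrete, a back-of-the-envelope sketch (our illustrative numbers, not the slides'): a brute-force search tree with branching factor b and depth d enumerates on the order of b**d action sequences.

```python
# Back-of-the-envelope search-tree size: branching factor b = number of
# applicable operators per state, depth d = plan length. Brute force
# enumerates on the order of b**d action sequences.
def search_tree_leaves(b: int, d: int) -> int:
    return b ** d

print(search_tree_leaves(3, 5))    # 243: a toy puzzle, trivial to search
print(search_tree_leaves(10, 20))  # 100000000000000000000: hopeless without heuristics
```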

13 of 42

Background: Domain-Independent Planning

To build a general planning algorithm, we need a way to explain our problem to the AI.

Shakey the robot

14 of 42

Planning Domain-Definition Language (PDDL)

domain.pddl: a model of the environment
  • How to represent a state?
  • What operators do we have?
  • How do they work?

problem.pddl: a model of the current task
  • What is the initial state?
  • What is the goal?
  • (optional) What do I want to optimize?

One domain to rule them all

15 of 42

PDDL Example: Tower of Hanoi

domain.pddl

(define (domain HANOI)
  (:requirements :strips)
  (:predicates (on ?x ?y)
               (smaller ?x ?y)
               (clear ?x))
  (:action move
    :parameters (?x ?y ?z)
    :precondition (and (on ?x ?y)
                       (clear ?z) (clear ?x)
                       (smaller ?x ?z))
    :effect (and (on ?x ?z)
                 (clear ?y)
                 (not (on ?x ?y))
                 (not (clear ?z)))))
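As a minimal illustration of the STRIPS semantics of move (a Python sketch of ours, not part of the tutorial materials): a state is a set of ground facts, the action is applicable when its preconditions are a subset of the state, and its effects add and delete facts.

```python
# STRIPS semantics sketch: states are sets of ground facts (tuples).
# The grounded action move(x, y, z) mirrors the :action move above.

def applicable_move(state, x, y, z):
    # :precondition (and (on ?x ?y) (clear ?z) (clear ?x) (smaller ?x ?z))
    return {("on", x, y), ("clear", z), ("clear", x),
            ("smaller", x, z)} <= state

def apply_move(state, x, y, z):
    assert applicable_move(state, x, y, z)
    # :effect adds (on ?x ?z) and (clear ?y); deletes (on ?x ?y) and (clear ?z)
    return (state - {("on", x, y), ("clear", z)}) | {("on", x, z), ("clear", y)}

# Toy check: disc A sits on B; A and peg P are clear; A is smaller than P.
s = {("on", "A", "B"), ("clear", "A"), ("clear", "P"), ("smaller", "A", "P")}
s2 = apply_move(s, "A", "B", "P")
print(("on", "A", "P") in s2, ("clear", "B") in s2)  # True True
```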

16 of 42

PDDL Example: Tower of Hanoi

domain.pddl: as on the previous slide

problem.pddl

(define (problem HANOI-4-0)
  (:domain HANOI)
  (:objects A B C D E1 E2 E3)
  (:init
    (ON A B) (ON B C) (ON C D) (ON D E1)
    (CLEAR A) (CLEAR E2) (CLEAR E3)
    (SMALLER A B) (SMALLER A C) (SMALLER A D)
    (SMALLER B C) (SMALLER B D) (SMALLER C D)
    (SMALLER A E1) (SMALLER A E2) (SMALLER A E3)
    (SMALLER B E1) (SMALLER B E2) (SMALLER B E3)
    (SMALLER C E1) (SMALLER C E2) (SMALLER C E3)
    (SMALLER D E1) (SMALLER D E2) (SMALLER D E3))
  (:goal (AND (ON A B) (ON B C) (ON C D)
              (ON D E3))))
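The two files fully determine a search problem. As a sanity check, here is a self-contained breadth-first-search sketch (illustrative code of ours, not a real planner) that grounds move over the listed objects and solves HANOI-4-0; the optimal plan for four discs has 2**4 - 1 = 15 moves.

```python
from itertools import permutations
from collections import deque

objects = ["A", "B", "C", "D", "E1", "E2", "E3"]
discs, pegs = ["A", "B", "C", "D"], ["E1", "E2", "E3"]

# Static facts: each disc is smaller than every larger disc and every peg.
smaller = {(a, b) for i, a in enumerate(discs) for b in discs[i + 1:]}
smaller |= {(d, p) for d in discs for p in pegs}

init = frozenset({("on", "A", "B"), ("on", "B", "C"), ("on", "C", "D"),
                  ("on", "D", "E1"), ("clear", "A"), ("clear", "E2"),
                  ("clear", "E3")})
goal = {("on", "A", "B"), ("on", "B", "C"), ("on", "C", "D"), ("on", "D", "E3")}

def successors(state):
    # Ground move(?x ?y ?z) over all distinct object triples and keep the
    # applicable ones; yield (action, next_state) pairs.
    for x, y, z in permutations(objects, 3):
        if (("on", x, y) in state and ("clear", z) in state
                and ("clear", x) in state and (x, z) in smaller):
            nxt = frozenset((state - {("on", x, y), ("clear", z)})
                            | {("on", x, z), ("clear", y)})
            yield ("move", x, y, z), nxt

def bfs(init, goal):
    frontier, seen = deque([(init, [])]), {init}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [action]))

plan = bfs(init, goal)
print(len(plan))  # 15: optimal for four discs
```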

17 of 42

Classical Planning and Beyond

Classical planning (PDDL)

    • A single acting agent
    • State is a set of predicates (i.e., Booleans)
    • Deterministic effects
    • Full observability

Beyond classical planning

    • Numeric planning (PDDL 2.1)
    • Temporal planning (PDDL 2.1)
    • Exogenous events (PDDL 2.1, 3, +)
    • Multi-agent planning (MA-PDDL)
    • Probabilistic planning (PPDDL, RDDL)
    • Contingent planning …


18 of 42

The Dream: One Planner to Rule them All

[Pipeline: Planning Problem → Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …) → AI Planner → Plan]

How to find a plan in a large state space?

Which formal language to use?

And a dangling arrow remains: Planning Problem → ? (where does the formal model come from?)

19 of 42

Learning Planning Domain Models

[Pipeline: Planning Problem → Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …) → AI Planner → Plan. The missing step is filled in: observations of operators and their outcomes feed an AI Learner, which produces the formal domain+problem.]

20 of 42

Learning Planning Domain Models

[Same pipeline as the previous slide; the AI Learner here is, specifically, a Planning Domain Learner.]

21 of 42

Why Plan if you can just Act?


22 of 42

Planning vs. Reinforcement Learning (RL)

Symbolic Planning | Reinforcement Learning
Achieve goal, minimize cost | Maximize reward
Declaratively described state space, requires strong assumptions | Unstructured state space, few assumptions needed
Explicitly designed to exploit domain structure (predicates, signatures, etc.) | Implicit use of domain structure (learned online)
Offline planning, no errors ;) | Online, requires trial-and-error
No training; long, problem-specific planning | Long, problem-specific training
Formal guarantees on solution quality | Hard to obtain guarantees on solution

(RL caveats that soften these contrasts: Offline RL, Meta RL, RL+shielding, DNN verifiers.)

23 of 42

Why Planning?

24 of 42

Why Planning?

25 of 42

Why Planning?

26 of 42

Why Planning?

27 of 42

Why Learn a Planning Domain

instead of just Learning to Act?


28 of 42

Learning Planning Domain vs. RL

Learning Planning Domains | Model-Based RL | Model-Free RL
? | Fewer than model-free | Usually data-intensive
? | Runtime scales well with data |
Produces a reusable, composable domain model that planners can exploit | Produces a policy optimizing a fixed reward (can retrain offline) | Produces a policy optimizing a fixed reward

(RL caveats: Lifelong RL, Transfer learning, Offline RL, Meta RL, RL+shielding, DNN verifiers.)

Learning symbolic planning can be viewed as a special case of MBRL.

29 of 42

Why Learn a Symbolic Planning Domain?

  • Powerful domain-independent planners
    • Search algorithms
    • Heuristics
    • Symmetry detection

  • Formal solution quality guarantees
  • Easier to explain
  • Easier to diagnose and debug

See more on this: Vallati et al. '25

30 of 42

Learning Planning Domain Models

[Pipeline: observations of operators and their outcomes feed an AI Learner, which produces the formal domain+problem for the AI Planner. Example domain learners: FAMA (Aineto et al.), ARMS (Yang et al.), LOCM (Cresswell et al.), NOLAM (Lamanna & Serafini), …]

Common pushback: "Deep RL (e.g., PPO, DQN, …) can do it better!" and "Learning the domain is too hard."

31 of 42

Learning Planning Domain Models?

"Deep RL (e.g., PPO, DQN, …) can do it better!"

"Learning the domain is too hard."

Is it really? How far can we go?

32 of 42

What’s the Plan for Today?


1. What is planning?
   • Symbolic
   • Domain-independent
2. Why learn a domain for planning?
3. The domain model learning problem

33 of 42

Offline vs. Online Domain Model Learning

Offline learning: an external agent chooses the actions; the learning agent only processes the resulting observations (action/state sequences) and learns a domain model.

Online learning: the learning agent itself chooses actions, processes observations, and learns/updates the domain model while acting.

Active learning: online learning in which actions are chosen specifically to improve the learned model.
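The offline/online distinction is mainly a difference in the learner's interface to the environment. A hypothetical sketch (class and method names are ours, purely illustrative) of the contract each setting implies:

```python
# Hypothetical interfaces contrasting offline and online domain learning;
# the names and signatures are ours, not from the tutorial.

class OfflineDomainLearner:
    """Receives observations pre-collected by an external agent."""
    def learn(self, traces):
        # Process all observations at once and return a domain model.
        raise NotImplementedError

class OnlineDomainLearner:
    """Acts in the environment while learning."""
    def choose_action(self, state):
        # Decide what to try next (may trade exploration for progress).
        raise NotImplementedError

    def observe(self, state, action, next_state):
        # Update the current domain model from the latest transition.
        raise NotImplementedError
```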

34 of 42

Types of Domain Model Learning Problems

  • Assumption about the environment
    • Deterministic? single-agent? propositional? …
  • What is the objective?
    • Learn an accurate model?
    • Learn a useful model? Useful for what?
    • How to measure success? (more on this later)
  • Assumptions over the given observations


35 of 42

What Can We Observe?

[Diagram: offline learning from an example trace in a logistics-style domain: state, Move(A, B), state, Load(Pkg, C), state, Move(B, C), state.]

AKA: Trace, trajectory, episode, execution, transitions ….
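Such traces are exactly what a learner consumes. As a toy sketch (ours, deliberately simpler than the algorithms covered in the offline-learning session): with full observability and grounded actions, estimate an action's effects from the observed state differences and its preconditions as the intersection of the states it was executed in.

```python
# Toy offline action-model learner for fully observed, grounded traces.
# A trace alternates states (sets of facts) and action names:
# [s0, a1, s1, a2, s2, ...]. Noise, lifting, and partial observability
# are deliberately ignored in this sketch.

def learn_model(traces):
    pre, add, delete = {}, {}, {}
    for trace in traces:
        for i in range(1, len(trace), 2):
            s, a, s_next = trace[i - 1], trace[i], trace[i + 1]
            # Preconditions: facts that held in *every* pre-state of a.
            pre[a] = set(s) if a not in pre else pre[a] & s
            # Effects: facts that appeared (adds) or disappeared (deletes).
            add.setdefault(a, set()).update(s_next - s)
            delete.setdefault(a, set()).update(s - s_next)
    return pre, add, delete

# A tiny hand-made trace (the facts and action names are hypothetical):
trace = [{"at-A", "hands-free"}, "move-A-B", {"at-B", "hands-free"},
         "load-Pkg",             {"at-B", "holding-Pkg"}]
pre, add, delete = learn_model([trace])
print(pre["load-Pkg"], add["load-Pkg"], delete["load-Pkg"])
```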

36 of 42

Partial Observability

[Diagram: the same trace, but some facts in each observed state are hidden from the learner.]
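One common way a learner can cope with partially observed states (a sketch of ours, not a specific algorithm from the tutorial): represent each observation as observed-true and observed-false fact sets, and shrink an action's candidate preconditions only when a fact is explicitly observed false in one of its pre-states; unobserved facts rule nothing out.

```python
def candidate_preconditions(universe, pre_observations):
    """pre_observations: list of (action, obs_true, obs_false) taken just
    before the action was applied. Facts outside obs_true | obs_false are
    unknown. A fact stays a candidate precondition of an action unless it
    was explicitly observed false in some pre-state of that action."""
    cand = {}
    for action, obs_true, obs_false in pre_observations:
        cand.setdefault(action, set(universe))
        cand[action] -= obs_false  # only observed-false facts are ruled out
    return cand

universe = {"p", "q", "r"}
obs = [("act", {"p"}, {"q"}),   # p observed true, q observed false, r unknown
       ("act", {"p"}, set())]   # only p observed this time
print(sorted(candidate_preconditions(universe, obs)["act"]))  # ['p', 'r']
```

Under full observability, obs_false is the complement of obs_true, and this rule reduces to the usual intersection of pre-states.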


39 of 42

Noisy Observations

[Diagram: the same trace, but some observed facts may be wrong.]

40 of 42

Raw Observations

[Diagram: the same trace, but states arrive as raw observations rather than symbolic facts.]

Images?

Text?

Sensor data?

41 of 42

Types of Domain Model Learning Problems

domain.pddl

(define (domain HANOI)
  (:requirements :strips)
  (:predicates (on ?x ?y) (smaller ?x ?y) (clear ?x))
  (:action move
    :parameters (?x ?y ?z)
    :precondition (and (on ?x ?y) (clear ?z)
                       (clear ?x) (smaller ?x ?z))
    :effect (and (on ?x ?z) (clear ?y)
                 (not (on ?x ?y)) (not (clear ?z)))))

Representation learning: learning the predicates, i.e., the state representation.

Action model learning: learning the actions' preconditions and effects.

42 of 42

What’s Next

Time        | Session                               | Speaker
08:30–09:15 | Introduction & Domain Learning Basics | Roni Stern
09:15–09:45 | Learning State Abstractions           | Roni Stern
09:45–10:30 | Offline Learning of Action Models     | Leonardo Lamanna
10:30–11:00 | Coffee Break                          |
11:00–11:45 | Hands-on Session                      | Leonardo Lamanna
11:45–12:30 | Online Learning and Open Challenges   | Roni Stern