1 of 51

2 of 51

What’s the Plan for Today?

Time	Session	Speaker
08:30–09:15	Introduction & Domain Learning Basics	Roni Stern
09:15–10:00	Offline Learning Action Models	Leonardo Lamanna
10:00–10:30	Coffee Break ☕
10:30–11:00	Learning State Abstraction	Roni Stern
11:00–11:45	Active Learning and Open Challenges	Roni Stern
11:45–12:30	Hands-on Session	Leonardo Lamanna

4 of 51

What’s the Plan for Today?

Time	Session	Speaker
08:30–09:15	Introduction & Domain Learning Basics	Roni Stern
09:15–10:00	Offline Learning Action Models	Argaman Mordoch
10:00–10:30	Coffee Break ☕
10:30–11:00	Learning State Abstraction	Roni Stern
11:00–11:45	Active Learning and Open Challenges	Argaman Mordoch & Roni Stern
11:45–12:30	Hands-on Session	Leonardo Lamanna

1. What is planning?

Symbolic

Domain-independent

2. Why learn a domain for planning?

3. The domain model learning problem

5 of 51

Domain Model Learning in AI Planning

Part 1: Introduction & Basics

Roni Stern

Ben Gurion University of the Negev

6 of 51

Background: Planning

What is Planning?

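[Figure: search tree rooted at the initial state S. Operators generate successor states (A, B, then C, D, E), and each state is tested with "Is goal?". When the answer is "Yes!", the path from the initial state to the goal is a plan.]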
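The search picture on this slide can be sketched as a breadth-first forward search. This is a minimal illustration under stated assumptions, not a production planner: the function name `bfs_plan`, the `successors` callback, and the toy integer problem with `inc` and `double` operators are all hypothetical and do not come from the slides.

```python
from collections import deque

def bfs_plan(initial_state, is_goal, successors):
    """Breadth-first forward search from the initial state.

    successors(state) yields (operator_name, next_state) pairs.
    Returns a list of operator names, or None if no plan exists.
    """
    # Maps each reached state to (previous_state, operator); None marks the root.
    parent = {initial_state: None}
    frontier = deque([initial_state])
    while frontier:
        state = frontier.popleft()
        if is_goal(state):  # the "Is goal?" test
            plan = []
            while parent[state] is not None:
                state, op = parent[state]
                plan.append(op)
            return plan[::-1]
        for op, nxt in successors(state):
            if nxt not in parent:  # skip states that were already reached
                parent[nxt] = (state, op)
                frontier.append(nxt)
    return None  # state space exhausted: report "no solution"

def successors(n, limit=20):
    # Hypothetical operators on an integer state, bounded so the space is finite.
    for op, nxt in (("inc", n + 1), ("double", n * 2)):
        if nxt <= limit:
            yield op, nxt

plan = bfs_plan(1, lambda n: n == 10, successors)
no_plan = bfs_plan(1, lambda n: n == 25, successors)
```

Here `plan` is a shortest operator sequence from 1 to 10, and `no_plan` is `None` because 25 lies outside the bounded state space.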

7 of 51

Background: Planning

Why Planning?

[Figure: search tree rooted at the initial state S. Operators generate successor states (A, B, C, D, E), each tested with "Is goal?" until the goal is reached.]

8 of 51

Why Plan if you can Chat?

9 of 51

Why Planning?

10 of 51

Why Planning?

11 of 51

Why Planning?

12 of 51

Why Planning?

13 of 51

4 months have passed…

(since the AAAI tutorial)

14 of 51

Why Planning?

15 of 51

Why Planning?

16 of 51

Symbolic Planning

The World Still Needs Symbolic Planners

The world is described by symbols (predicates, objects, functions, …)
The world dynamics are well understood
Planners reason over how actions change the state

Reasoning trace can be explained

Designed for long-term reasoning

Can output “no solution” (instead of hallucinating)

Formal validation of each step

18 of 51

Background: Domain-Independent Planning

To build a general planning algorithm, we need a way to explain our problem to the AI

Shakey the robot

19 of 51

Planning Domain Definition Language (PDDL)

domain.pddl: model of the environment

How to represent a state?
What operators do we have?
How do they work?

problem.pddl: model of the current task

What is the initial state?
What is the goal?
(Optional) What do I want to optimize?

One domain to rule them all

20 of 51

PDDL Example: Tower of Hanoi

domain.pddl

(define (domain HANOI)
  (:requirements :strips)
  ; State representation: which object sits on which, relative sizes, free tops
  (:predicates (on ?x ?y)
               (smaller ?x ?y)
               (clear ?x))
  ; Move ?x from ?y onto ?z
  (:action move
    :parameters (?x ?y ?z)
    :precondition (and (on ?x ?y)
                       (clear ?z) (clear ?x)
                       (smaller ?x ?z))
    :effect (and (on ?x ?z)
                 (clear ?y)
                 (not (on ?x ?y))
                 (not (clear ?z)))))

http://editor.planning.domains/#

21 of 51

PDDL Example: Tower of Hanoi

problem.pddl

(define (problem HANOI-4-0)
  (:domain HANOI)
  (:objects A B C D E1 E2 E3)
  (:init
    (ON A B) (ON B C) (ON C D) (ON D E1)
    (CLEAR A) (CLEAR E2) (CLEAR E3)
    (SMALLER A B) (SMALLER A C) (SMALLER A D)
    (SMALLER B C) (SMALLER B D) (SMALLER C D)
    (SMALLER A E1) (SMALLER A E2) (SMALLER A E3)
    (SMALLER B E1) (SMALLER B E2) (SMALLER B E3)
    (SMALLER C E1) (SMALLER C E2) (SMALLER C E3)
    (SMALLER D E1) (SMALLER D E2) (SMALLER D E3))
  (:goal (AND (ON A B) (ON B C) (ON C D)
              (ON D E3))))
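To make the STRIPS semantics of the `move` action concrete, here is a minimal Python sketch that grounds the action and applies it to the initial state of HANOI-4-0. The set-of-tuples state encoding and the helper names `move` and `apply` are assumptions made for this illustration; real planners use far more efficient representations.

```python
def move(x, y, z):
    """Ground the `move` schema from domain.pddl for objects x, y, z."""
    pre = {("on", x, y), ("clear", z), ("clear", x), ("smaller", x, z)}
    add = {("on", x, z), ("clear", y)}
    delete = {("on", x, y), ("clear", z)}
    return pre, add, delete

def apply(state, action):
    """Return the successor state, or None if a precondition does not hold."""
    pre, add, delete = action
    if not pre <= state:
        return None
    return (state - delete) | add

discs = ["A", "B", "C", "D"]
pegs = ["E1", "E2", "E3"]

# Initial state of HANOI-4-0: all four discs stacked on peg E1.
init = {("on", "A", "B"), ("on", "B", "C"), ("on", "C", "D"), ("on", "D", "E1"),
        ("clear", "A"), ("clear", "E2"), ("clear", "E3")}
# Each disc is smaller than every larger disc and than every peg.
for i, small in enumerate(discs):
    for big in discs[i + 1:] + pegs:
        init.add(("smaller", small, big))

s1 = apply(init, move("A", "B", "E3"))      # applicable: A is clear, E3 is clear
blocked = apply(s1, move("B", "C", "E3"))   # not applicable: E3 is no longer clear
s2 = apply(s1, move("B", "C", "E2"))        # applicable: B became clear
```

A state is just the set of facts that are true, so applying an action is a subset test followed by a set difference and a union.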

22 of 51

Classical Planning and Beyond

Classical planning (PDDL)

Single acting agent
State is a set of predicates (i.e., Booleans)
Deterministic effects
Full observability

Beyond classical planning

Numeric planning (PDDL 2.1)
Temporal planning (PDDL 2.1)
Exogenous events (PDDL 2.1, PDDL 3, PDDL+)
Multi-agent planning (MA-PDDL)
Probabilistic planning (PPDDL, RDDL)
Contingent planning …

23 of 51

The Dream: One Planner to Rule them All

[Diagram: Planning Problem → ? → Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …) → AI Planner → Plan]

Why Is Planning Hard?

Which formal language to use?
How to find a plan in a large state space?

24 of 51

Symbolic Planning

The World Still Needs Symbolic Planners

The world is described by symbols (predicates, objects, functions, …)
The world dynamics are well understood
Planners reason over how actions change the state

Reasoning trace can be explained

Designed for long-term reasoning

Can output “no solution” (instead of hallucinating)

Formal validation of each step

25 of 51

Learning Planning Domain Models

[Diagram: Planning Problem → ? → Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …) → AI Planner → Plan. Additional labels: Operator, Observation, AI Learner.]

26 of 51

Learning Planning Domain Models

[Diagram: Planning Problem → ? → Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …) → AI Planner → Plan. Additional labels: Operator, Observation, Planning Domain Learner.]

27 of 51

Learning Planning Domain Models

[Diagram: Planning Problem → Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …) → AI Planner → Plan. Additional labels: Operator, Observation, AI Learner.]

28 of 51

Learning Planning Domain Models

[Diagram: Planning Problem → Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …) → AI Planner → Plan. Additional labels: Operator, Observation, Planning Domain Learner.]

29 of 51

Why Plan if you can just Act?

30 of 51

Planning vs. Reinforcement Learning (RL)

Symbolic Planning	Reinforcement Learning
Achieve goal, minimize cost	Maximize reward
Declaratively described state space, requires strong assumptions	Unstructured state space, few assumptions needed
Explicitly designed to exploit domain structure (predicates, signatures, etc.)	Implicit use of domain structure (learned online)
Offline planning, no errors ;)	Online, requires trial-and-error
No training; long, problem-specific planning	Long, problem-specific training
Formal guarantees on solution quality	Hard to obtain guarantees on solution

Offline RL

Meta RL

RL+shielding

DNN Verifiers

31 of 51

Why Planning?

32 of 51

Why Planning?

33 of 51

Why Planning?

34 of 51

Why Planning?

35 of 51

Why Learn a Planning Domain instead of just Learning to Act?

36 of 51

Learning Planning Domain vs. RL

Learning Planning Domains	Model-Based RL	Model-Free RL
?	Fewer than model-free	Usually data-intensive
?	Runtime scales well with data
Produces reusable composable domain model that planners can exploit	Produces policy optimizing a fixed reward (can retrain offline)	Produces policy optimizing a fixed reward

Lifelong RL

Transfer learning

Offline RL

Meta RL

RL+shielding

DNN Verifiers

Learning symbolic planning domains can be viewed as a special case of model-based RL (MBRL)

37 of 51

Why Learn a Symbolic Planning Domain?

Powerful domain-independent planners

Search algorithms
Heuristics
Symmetry detection

…

Formal solution quality guarantees
Easier to explain
Easier to diagnose and debug

See more on this: Vallati et al. ’25

38 of 51

Learning Planning Domain Models

[Diagram: Planning Problem → Formal Domain+Problem (STRIPS, PDDL, PDDL+, RDDL, fSTRIPS, …) → AI Planner → Plan. Additional labels: Operator, Observation, AI Learner.]

FAMA (Aineto et al.), ARMS (Yang et al.), LOCM (Cresswell et al.), NOLAM (Lamanna & Serafini), …

Deep RL (e.g., PPO, DQN, …) can do it better!

Learning the domain is too hard

39 of 51

Learning Planning Domain Models?

Deep RL (e.g., PPO, DQN, …) can do it better!

Learning the domain is too hard

Is it really?

How far can we go?

40 of 51

What’s the Plan for Today?

1. What is planning?

Symbolic

Domain-independent

2. Why learn a domain for planning?

3. The domain model learning problem

Time	Session	Speaker
08:30–09:15	Introduction & Domain Learning Basics	Roni Stern
09:15–10:00	Offline Learning Action Models	Argaman Mordoch
10:00–10:30	Coffee Break ☕
10:30–11:00	Learning State Abstraction	Roni Stern
11:00–11:45	Active Learning and Open Challenges	Argaman Mordoch & Roni Stern
11:45–12:30	Hands-on Session	Leonardo Lamanna

41 of 51

Types of Domain Model Learning Problems

domain.pddl

(define (domain HANOI)
  (:requirements :strips)
  (:predicates (on ?x ?y) (smaller ?x ?y) (clear ?x))
  (:action move
    :parameters (?x ?y ?z)
    :precondition
      (and (on ?x ?y) (clear ?z)
           (clear ?x) (smaller ?x ?z))
    :effect
      (and (on ?x ?z) (clear ?y)
           (not (on ?x ?y)) (not (clear ?z)))))

Representation Learning

Action Model Learning

42 of 51

Offline vs. Online Domain Model Learning

Offline Learning

An External Agent acts (Action, State); the Learning Agent only receives the Observations.
Learning Agent:
Process observation
Learn domain model

Online Learning / Active Learning

The Learning Agent itself acts (Action, State).
Learning Agent:
Choose actions
Process observations
Learn/update domain model
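The difference between the two settings is who chooses the actions. The sketch below contrasts them on a toy problem. Everything in it is hypothetical and chosen only for illustration: the `ToyEnv` lamp environment, the table-based "model" that maps (state, action) to the next state, and the rule of trying untested actions first.

```python
import random

class ToyEnv:
    """Hypothetical environment: a lamp that is 'off' or 'on'."""
    def __init__(self):
        self.state = "off"

    def step(self, action):
        if action == "toggle":
            self.state = "on" if self.state == "off" else "off"
        # Any other action (here: "wait") leaves the state unchanged.
        return self.state

def update(model, transition):
    """Record one observed (state, action, next_state) transition."""
    state, action, next_state = transition
    model[(state, action)] = next_state

def offline_learning(observations):
    """Learn only from transitions produced by someone else."""
    model = {}
    for transition in observations:   # process observations
        update(model, transition)     # learn domain model
    return model

def online_learning(env, actions, steps):
    """Choose actions, observe their outcome, and update the model."""
    model = {}
    for _ in range(steps):
        state = env.state
        # Choose an action whose outcome in this state is still unknown, if any.
        untried = [a for a in actions if (state, a) not in model]
        action = untried[0] if untried else random.choice(actions)
        next_state = env.step(action)               # act and observe
        update(model, (state, action, next_state))  # learn/update domain model
    return model

offline_model = offline_learning([("off", "toggle", "on")])
online_model = online_learning(ToyEnv(), ["wait", "toggle"], steps=4)
```

The offline learner is limited to whatever the external agent happened to do, while the online learner can steer its own data collection toward transitions it has not seen yet.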

43 of 51

Types of Domain Model Learning Problems

Assumptions about the environment

Deterministic? Single-agent? Propositional? …

What is the objective?

Learn an accurate model?
Learn a useful model? Useful for what?
How to measure success? (more on this later)

Assumptions about the given observations

44 of 51

What Can We Observe?

[Diagram: offline learning. An External Agent acts (Action, State); the Learning Agent receives the Observations, processes them, and learns a domain model.]

Example observation: state, Move(A, B), state, Load(Pkg, C), state, Move(B, C), state (each state drawn with elements labeled A, B, C)

AKA: trace, trajectory, episode, execution, transitions, …
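A trace can be stored as an alternating list of states and actions, and even a very naive learner can extract a first action model from it. The sketch below is an assumption-laden illustration, not one of the algorithms covered in the tutorial: it works on grounded actions only, the fact names such as `truck-at-A` are hypothetical, and the traces are invented for the example rather than copied from the slide.

```python
def learn_action_models(traces):
    """Naive learner over fully observed, noise-free traces.

    Each trace is [s0, a1, s1, a2, s2, ...]: states are sets of facts,
    actions are grounded action names.
    For every action:
      pre = facts true in every state where the action was applied
      add = facts that became true after the action
      del = facts that became false after the action
    """
    models = {}
    for trace in traces:
        for i in range(1, len(trace), 2):
            pre_state, action, post_state = trace[i - 1], trace[i], trace[i + 1]
            model = models.setdefault(
                action, {"pre": set(pre_state), "add": set(), "del": set()})
            model["pre"] &= pre_state               # keep facts seen every time
            model["add"] |= post_state - pre_state
            model["del"] |= pre_state - post_state
    return models

# Two hypothetical traces in which the same grounded action is observed.
trace_1 = [{"truck-at-A", "pkg-at-C"}, "Move(A,B)", {"truck-at-B", "pkg-at-C"}]
trace_2 = [{"truck-at-A", "pkg-at-A"}, "Move(A,B)", {"truck-at-B", "pkg-at-A"}]

models = learn_action_models([trace_1, trace_2])
```

With a single trace the learned precondition would include the package location as well; the second trace shows that it is irrelevant, which is why more observations tend to give less over-specific preconditions.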

45 of 51

Partial Observability

[Diagram: offline learning. An External Agent acts (Action, State); the Learning Agent receives the Observations, processes them, and learns a domain model. The example trace is state, Move(A, B), state, Load(Pkg, C), state, Move(B, C), state, with states labeled A, B, C.]
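One common way to think about partial observability is that some facts in an observed state are simply unknown to the learner. The sketch below assumes that reading; the three-valued encoding (`True`, `False`, `None` for unknown), the helper names, and the fact names are hypothetical and not taken from the slides.

```python
def observe_partially(state, all_facts, hidden):
    """Map each fact to True/False if it is observed, or None if it is hidden."""
    return {fact: (None if fact in hidden else fact in state)
            for fact in sorted(all_facts)}

def known_true(observation):
    """Facts the learner can be sure hold in the state."""
    return {fact for fact, value in observation.items() if value is True}

def possibly_true(observation):
    """Facts that hold or might hold (observed true, or hidden)."""
    return {fact for fact, value in observation.items() if value is not False}

all_facts = {"truck-at-A", "truck-at-B", "pkg-at-C"}
state = {"truck-at-A", "pkg-at-C"}

observation = observe_partially(state, all_facts, hidden={"pkg-at-C"})
```

A learner that only sees `observation` cannot tell whether `pkg-at-C` held, so it has to reason with the gap between `known_true` and `possibly_true`.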

48 of 51

Noisy Observations

[Diagram: offline learning. An External Agent acts (Action, State); the Learning Agent receives the Observations, processes them, and learns a domain model. The example trace is state, Move(A, B), state, Load(Pkg, C), state, Move(B, C), state, with states labeled A, B, C.]
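Noisy observations can be simulated by flipping the truth value of each fact with some probability. This is a minimal sketch under that assumption; the noise model, the function name `observe_noisily`, and the fact names are hypothetical and only meant to make the idea concrete.

```python
import random

def observe_noisily(state, all_facts, flip_prob, rng):
    """Return the set of facts reported as true after independent random flips."""
    observed = set()
    for fact in sorted(all_facts):       # fixed order keeps runs reproducible
        value = fact in state
        if rng.random() < flip_prob:     # with probability flip_prob, report the opposite
            value = not value
        if value:
            observed.add(fact)
    return observed

all_facts = {"truck-at-A", "truck-at-B", "pkg-at-C"}
state = {"truck-at-A", "pkg-at-C"}

clean = observe_noisily(state, all_facts, flip_prob=0.0, rng=random.Random(0))
inverted = observe_noisily(state, all_facts, flip_prob=1.0, rng=random.Random(0))
noisy = observe_noisily(state, all_facts, flip_prob=0.1, rng=random.Random(0))
```

With `flip_prob=0.0` the observation equals the true state, and with `flip_prob=1.0` every fact is reported wrongly; values in between give the learner a mix it cannot fully trust.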

49 of 51

Raw Observations

[Diagram: offline learning. An External Agent acts (Action, State); the Learning Agent receives the Observations, processes them, and learns a domain model. The example trace is state, Move(A, B), state, Load(Pkg, C), state, Move(B, C), state, with states labeled A, B, C.]

Images? Text? Sensor data?

50 of 51

What’s the Plan for Today?

Next: Offline Learning of Action Models

Time	Session	Speaker
08:30–09:15	Introduction & Domain Learning Basics	Roni Stern
09:15–10:00	Offline Learning Action Models	Argaman Mordoch
10:00–10:30	Coffee Break ☕
10:30–11:00	Learning State Abstraction	Roni Stern
11:00–11:45	Active Learning and Open Challenges	Argaman Mordoch & Roni Stern
11:45–12:30	Hands-on Session	Leonardo Lamanna

Argaman Mordoch

BGU

51 of 51

Rate This Tutorial @ domain-learning.github.io
