Interactive Learning with Grounded Language Agents Utilizing World Models
Arjun V Sudhakar* 1,3
Sai Rajeswar* 2
1 - Mila - Quebec AI Institute
2 - ServiceNow Research
3 - Polytechnique Montreal
Goal
Background
Yao et al., 2022, NeurIPS
Motivation
Research Question
Experimental Setup
The Hypothesis of Experiment 1: Determine which modality of information the system relies on most to solve the problem.
Understanding/Learning:
Research Question
Experimental Setup
Note: 5 random seeds
The Hypothesis of Experiment 2:
Does randomly shuffling the instructions degrade performance?
Can the model maintain its results regardless of the syntactic or semantic information in the instructions?
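A minimal sketch of the instruction-shuffling perturbation, assuming the shuffle is applied at the word level (the slides do not state the granularity). The function name `shuffle_instruction` and the example instruction are hypothetical, and the five seeds mirror the note above.

```python
import random


def shuffle_instruction(instruction: str, seed: int) -> str:
    """Return the instruction with its whitespace-separated words in random order."""
    words = instruction.split()
    rng = random.Random(seed)  # local RNG so each seed gives a reproducible shuffle
    rng.shuffle(words)
    return " ".join(words)


# Hypothetical instruction, used only for illustration.
instruction = "find a red cotton shirt under 20 dollars"

# One shuffled variant per seed (5 seeds, as in the setup note).
shuffled_by_seed = {seed: shuffle_instruction(instruction, seed) for seed in range(5)}
for seed, text in shuffled_by_seed.items():
    print(seed, text)
```

Shuffling keeps the bag of words intact while destroying word order, so a small drop in performance would suggest the agent relies mostly on keywords rather than syntax.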
Outcome:
Research Question
Experimental Setup
The Hypothesis of Experiment 3: Rather than just concatenating language and vision representations, can we use FiLM to fuse the representations?
Outcome: An improvement in performance would suggest that more sophisticated fusion methods are worth pursuing; otherwise, we can focus on other methods for improvement.
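To make the contrast in Experiment 3 concrete, here is a rough NumPy sketch of FiLM-style fusion next to plain concatenation. It assumes flat feature vectors and a single linear layer for the scale and shift. The name `film_fuse`, the weights, and all sizes are hypothetical and not taken from the actual model.

```python
import numpy as np


def film_fuse(visual, language, w_gamma, b_gamma, w_beta, b_beta):
    """Modulate visual features with a scale and shift predicted from language.

    visual:   (batch, channels) visual features
    language: (batch, lang_dim) instruction embedding
    """
    gamma = language @ w_gamma + b_gamma  # per-channel scale, (batch, channels)
    beta = language @ w_beta + b_beta     # per-channel shift, (batch, channels)
    return gamma * visual + beta


rng = np.random.default_rng(0)
batch, channels, lang_dim = 2, 8, 16  # hypothetical sizes

visual = rng.normal(size=(batch, channels))
language = rng.normal(size=(batch, lang_dim))
w_gamma = rng.normal(size=(lang_dim, channels)) * 0.1
w_beta = rng.normal(size=(lang_dim, channels)) * 0.1
b_gamma = np.ones(channels)   # bias of 1 keeps the initial scale near identity
b_beta = np.zeros(channels)

fused = film_fuse(visual, language, w_gamma, b_gamma, w_beta, b_beta)
concat = np.concatenate([visual, language], axis=1)  # concatenation baseline

print(fused.shape, concat.shape)
```

Unlike concatenation, FiLM keeps the visual feature dimensionality unchanged and lets the instruction rescale and shift each visual channel.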
Research Question
Experimental Setup
The Hypothesis of Experiment 4: A model-based approach can make better use of historical data and plan ahead, yielding better results.
Outcome: Either the model-based approach improves on the existing baseline, or performance drops due to hyperparameter sensitivity or other difficulties in modeling the world.
Timeline