
ALICE: Aligning Language models with Interactive Code Execution in game engines

Yang Su

Student Researcher @ Cornell & Millennium

This work is supported by


ALICE (alicellm.github.io)

ALICE

ALICE is …

A Meta-Agent Collaboration System

ALICE can …

Turns Voice -> Text -> Code -> Interactive 3D Scenes and Objects
Generates high-quality Data through Multi-turn Interactions and Feedback

ALICE can be used in …

Virtual Reality, Game Engines, Robotics Simulation
Complex Development Cycles, Human-Computer Collaboration
Data-limited Environments and Business Scenarios

Motivation

Code generation is hard in virtual environments (robotics simulation, virtual reality, autonomous driving)

The number of tools and interaction categories is huge, because precise control over details is required
Lack of efficiency for real-time interaction

Motivation & Contribution

We built a Large Language Model (LLM) based system to generate code with

Low-cost function calls that focus only on the relevant tools
Parallelizable code execution for efficient interaction
Customizable interaction sets, i.e. user-defined functions
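
The three properties above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the system's implementation: the registry `TOOLS`, the helpers `register_tool`, `relevant_tools` and `run_parallel`, and the keyword lists are all hypothetical, and plain keyword overlap stands in for whatever relevance mechanism the system actually uses.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tool registry: name -> (keywords, callable). Not ALICE's actual API.
TOOLS = {}

def register_tool(name, keywords, fn):
    """Add a user-defined function to the interaction set."""
    TOOLS[name] = (set(keywords), fn)

def relevant_tools(instruction):
    """Keep only tools whose keywords appear in the instruction, so a prompt
    would need to describe a handful of tools instead of all of them."""
    words = set(instruction.lower().split())
    return sorted(name for name, (keywords, _) in TOOLS.items() if keywords & words)

def run_parallel(calls):
    """Run independent (tool_name, argument) calls concurrently.
    Results are returned in the same order as the input calls."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(TOOLS[name][1], arg) for name, arg in calls]
        return [f.result() for f in futures]

# Three illustrative user-defined tools (assumed names and behavior).
register_tool("set_color", ["color", "orange", "make"], lambda c: f"color={c}")
register_tool("select_by_height", ["select", "meters", "height"], lambda r: f"height in {r}")
register_tool("set_weather", ["weather", "sunny", "rain"], lambda w: f"weather={w}")

names = relevant_tools("Select buildings from 6 to 10 meters and make them orange")
results = run_parallel([("select_by_height", (6, 10)), ("set_color", "orange")])
print(names)    # the weather tool is filtered out
print(results)
```

Filtering before prompting is what keeps each call cheap in this sketch: the unrelated `set_weather` tool never has to be described to the model.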


Preliminary Result

User: Select buildings from 6 to 10 meters and make them orange

Executable Code Block
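
The code block shown on the slide is not recoverable from the text. As a rough idea of what an executable block for this instruction might do, here is a minimal Python sketch. The `Building` class, the helper names and the inclusive height bounds are assumptions made for illustration; the real system generates code against the game engine's own objects.

```python
from dataclasses import dataclass

@dataclass
class Building:
    """Hypothetical stand-in for a scene object in the engine."""
    name: str
    height_m: float
    color: str = "gray"

def select_by_height(buildings, low_m, high_m):
    """Return buildings whose height lies in [low_m, high_m] meters (bounds assumed inclusive)."""
    return [b for b in buildings if low_m <= b.height_m <= high_m]

def set_color(buildings, color):
    """Recolor every selected building in place."""
    for b in buildings:
        b.color = color

scene = [Building("A", 4.0), Building("B", 6.0), Building("C", 9.5), Building("D", 12.0)]
selected = select_by_height(scene, 6, 10)   # "buildings from 6 to 10 meters"
set_color(selected, "orange")               # "make them orange"
print([(b.name, b.color) for b in scene])
```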


We call this system Voice2Action

The Voice2Action pipeline is built in a Virtual Reality scene for highly efficient code generation
Reduces code generation cost by 30x compared with calling LLMs in the traditional way

How efficient?
3k user instructions cost only 5 dollars with OpenAI's GPT-3.5-Turbo API (roughly 0.17 cents per instruction)

Core concepts include…

Code Interpreter
Code Reflection
Runtime Compiling via Assembly References
Multi-Threading
etc.
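
To make these concepts concrete, the sketch below shows a Python analogy: source text is compiled at runtime, reflection is used to list what it defines, and the resulting functions run on a thread pool. It is only an analogy under assumptions. `GENERATED_SOURCE`, `load_functions` and `describe` are hypothetical names, and the engine-side mechanism named on the slide (runtime compiling via assembly references) is not reproduced here.

```python
import inspect
from concurrent.futures import ThreadPoolExecutor

# Stand-in for code produced by a model at runtime (hypothetical content).
GENERATED_SOURCE = '''
def scale(x, factor=2):
    """Scale a value by a factor."""
    return x * factor

def shift(x, offset=1):
    """Shift a value by an offset."""
    return x + offset
'''

def load_functions(source):
    """Compile source text at runtime and return the functions it defines."""
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return {name: obj for name, obj in namespace.items() if inspect.isfunction(obj)}

def describe(functions):
    """Use reflection to collect each function's signature and docstring."""
    return {name: (str(inspect.signature(fn)), inspect.getdoc(fn))
            for name, fn in functions.items()}

functions = load_functions(GENERATED_SOURCE)
docs = describe(functions)

# Run the independent functions concurrently on the same input.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = {name: pool.submit(fn, 10) for name, fn in functions.items()}
    outputs = {name: f.result() for name, f in futures.items()}

print(docs)
print(outputs)
```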

Voice2Action: code generation in virtual reality

Going Forward - The ALICE Project


ALICE (Aligning Language models with Interactive Code Execution in game engines)

Human instructions are vague and personalized

“Draw a building” -> what size? location? shape?
Current LLM (autonomous agent) systems often complete the task without seeking user specifications

How to “prompt” the user to make them aware of what could be done in the environment?

Robotics manipulation -> “Give me an apple” -> there is no apple nearby, do I search further or stop?
Just like human vs. human interactions!

We are building an efficient multi-agent system in game engines via code generation

Why multi-agent - personalization
Why game engines - fully controllable environment
Why interactive code generation - flexibility & versatility


ALICE - code generation in game engines

Unity Engine

Unreal Engine

ALICE

Urban Planning

Gaming

VR Workspace

Communication

Controller LLM (Object Creator)

Create / edit direct children controllers (objects)
Set up (new) intention communication routes
Call into intentions

Frozen weights: only the prompt is modified

[Diagram: Controller LLM 1-1, 1-2, 1-3 and Controller LLM 2-1, 2-2, 2-3, with nodes marked old / new / old, alongside the Intent LLM (Script Creator)]


Intent

Intent LLM (Script Creator)

Create inherited scripts (intent LLMs): MUST ask the controller LLM for a communication route
Set up (new) fields and methods (tools)

Fine-tunable weights, shared per script, i.e. each field / function goes into the same model

[Diagram labels: Format (JSON), Function Call, Augmented Tool, Documentation, Param Config, Function Call, Field Config; annotation: i.e. retrieval systems, vision module, etc.]

Execution LLM

No modification of existing fields / functions allowed
Create new fields / functions
Compilation and runtime feedback
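
The rules above, together with the JSON function-call format, can be illustrated with a small Python sketch. Everything named here is an assumption for illustration: the `Script` class, the JSON keys `"function"` and `"params"`, and the shape of the feedback dictionary are not taken from the system.

```python
import json
import traceback

class Script:
    """Hypothetical script: new functions may be added, existing ones are read-only."""

    def __init__(self):
        self._functions = {}

    def add_function(self, name, fn):
        """Create a new function; refuse to overwrite an existing one."""
        if name in self._functions:
            raise PermissionError(f"{name} already exists and may not be modified")
        self._functions[name] = fn

    def call(self, message):
        """Execute a JSON call such as {"function": ..., "params": {...}}
        and return pass/fail feedback, including the error trace on failure."""
        try:
            request = json.loads(message)
            fn = self._functions[request["function"]]
            result = fn(**request.get("params", {}))
            return {"status": "pass", "result": result}
        except Exception:
            return {"status": "fail", "error_trace": traceback.format_exc()}

script = Script()
script.add_function("render", lambda weather: f"rendering {weather} sky")

ok = script.call('{"function": "render", "params": {"weather": "sunny"}}')
bad = script.call('{"function": "render", "params": {"wether": "sunny"}}')  # misspelled parameter

try:
    script.add_function("render", lambda weather: "overwritten")
    blocked = False
except PermissionError:
    blocked = True

print(ok["status"], bad["status"], blocked)
```

In a setup like this, the error trace from a failed call is the runtime feedback that could be passed back to the model for another attempt.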


Interactive Code Execution

User: Build the Grand Canyon in Arizona, United States

User (original in Chinese): “Build many mountains with different styles”

N / A

(Through Multi-Turn Conversation)

Ground Truth (not used in training)

Before vs. After Alignment

[Diagram: pipeline for the user request “draw a vibrant outdoor scene…”. Recoverable labels:]

Controller LLM (Agent Creator): Environment Configuration (Long-Term Memory); Script A (terrain), Script E (skybox), …; intents “A riverside village..” and “In a sunny day..” [Parallelized]; communicate
Intent LLM (Script Creator), on Script E: Augmented Tool, Documentation, Param Config, Function Call; E::Paint (tree type): “No tree..” -> N/A; E::Render (weather type): “Sunny..” -> Execute; “Sunny weather” [Parallelized^2]; communicate
Execution LLM, on Script E: E::Render() Documentation, Field Config, Format (JSON), Function Call; Execution Examples (Short-Term Memory): Pass, Fail, Error Trace
LVM for Feedback: Human vs. GPT4-V Agreement
Frame updates
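
One reading of the diagram is that each function of a script is checked against the incoming intent in parallel, and only the relevant ones are executed. The Python sketch below illustrates that reading and nothing more. `SCRIPT_E`, `decide` and `plan` are hypothetical names, the keyword sets are invented, and keyword overlap stands in for the Intent LLM's judgement.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical description of Script E's functions, reusing the slide's labels.
SCRIPT_E = {
    "E::Paint": {"tree", "forest"},               # tree type
    "E::Render": {"sunny", "rainy", "weather"},   # weather type
}

def decide(function_name, keywords, intent):
    """Stand-in for one Intent LLM query: is this function relevant to the intent?"""
    words = set(intent.lower().split())
    return function_name, ("Execute" if keywords & words else "N/A")

def plan(intent):
    """Check every function of the script against the intent concurrently."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(decide, name, kw, intent) for name, kw in SCRIPT_E.items()]
        return dict(f.result() for f in futures)

decisions = plan("Sunny weather")
print(decisions)
```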

Collaborations

Past Collaborations

Current Collaborations

Cornell NLP
Millennium - Cornell ICPC Initiative

Future Collaborations

BigCode
Alibaba Qwen Team

Conclusion & Broader Impact

Why is ALICE useful?

Data generation with active intervention (with respect to Sora)
Data generation in data-limited environments (virtual reality, robotics simulation, autonomous driving, etc.)

Advantages of ALICE?