Quiz 10 - Multi-Agent Systems [11/17]
Email *

1. According to Oriol Vinyals, why was Imitation Learning (IL) or pre-training a necessary first step before applying Reinforcement Learning (RL) in the AlphaStar project? *
1 point
2. What primary issue with standard self-play reinforcement learning did the AlphaStar team solve by introducing the 'League of Exploiters'? *
1 point

3. Vinyals notes that the compute envelope for modern LLMs is heavily skewed toward pre-training (IL). What does he argue this imbalance prevents in the current LLM training paradigm? *
1 point

4. What is the key difference when applying multi-agent reinforcement learning techniques (like the League) from a competitive game like StarCraft to collaborative real-world LLM agent tasks (e.g., a math tutor)? *
1 point

5. How did the architecture of AlphaStar's action space relate to modern LLM function calling? *
1 point
A copy of your responses will be emailed to the address you provided.
This form was created inside of UC Berkeley.