Overview of our plan so far:
- Use Reinforcement Learning (Q-learning)
- Goal: learn an optimal policy for the snake, i.e. avoid walls and the other snake, and trap the other snake if possible
- Q assigns a value to each action in each state (the expected long-term reward of taking that action there); these values have to be learnt
- Battlesnake Gym: an open-source environment, made specifically for Battlesnake, for training an RL model; the environment conforms to the OpenAI Gym interface
(image from DD2380 slides)
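
To make the plan above concrete, here is a minimal sketch of tabular Q-learning against a Gym-style environment. Everything in it is an assumption for illustration: `ToyCorridor` is a hypothetical stand-in (a 5-cell corridor with a wall on the left and a goal on the right), not Battlesnake Gym, and the names `q_update`, `choose_action`, `train` and the values of `alpha`, `gamma`, `epsilon` are ours. The `step` method returns the classic Gym 4-tuple `(observation, reward, done, info)`; newer Gym/Gymnasium versions return five values instead.

```python
import random
from collections import defaultdict


class ToyCorridor:
    """Hypothetical stand-in environment (NOT Battlesnake Gym).

    States are cells 0..4. Action 0 moves left, action 1 moves right.
    Walking left from cell 0 hits a wall (reward -1, episode ends).
    Reaching cell 4 is the goal (reward +1, episode ends).
    """

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        self.pos += 1 if action == 1 else -1
        if self.pos < 0:
            return 0, -1.0, True, {}          # hit the wall
        if self.pos == self.length - 1:
            return self.pos, 1.0, True, {}    # reached the goal
        return self.pos, 0.0, False, {}


def q_update(q, state, action, reward, next_state, done,
             alpha=0.1, gamma=0.9, n_actions=2):
    """One Q-learning step: move Q(s, a) towards r + gamma * max_a' Q(s', a')."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


def choose_action(q, state, epsilon, rng, n_actions=2):
    """Epsilon-greedy action choice, breaking ties between equal values at random."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    best = max(q[(state, a)] for a in range(n_actions))
    return rng.choice([a for a in range(n_actions) if q[(state, a)] == best])


def train(episodes=500, epsilon=0.1, max_steps=1000, seed=0):
    """Learn a Q-table on the toy corridor; returns a dict keyed by (state, action)."""
    rng = random.Random(seed)
    env = ToyCorridor()
    q = defaultdict(float)  # unseen (state, action) pairs start at 0
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done, _ = env.step(action)
            q_update(q, state, action, reward, next_state, done)
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    q = train()
    for s in range(4):
        print(s, round(q[(s, 0)], 3), round(q[(s, 1)], 3))
```

After training, the learnt values should favour moving right (towards the goal) in every cell. For Battlesnake the state space is far larger than a 5-cell corridor, so a plain table like this is only meant to show the update rule, not how the final model would be built.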