NAIL122 AI for Games
The AI of RTS games
Peter Guba, Jakub Gemrot
Faculty of Mathematics and Physics
Charles University
What are RTS games?
RTS = Real-Time Strategy
Generally any game that is played in real-time and involves strategizing
More narrow meaning – games like Starcraft, Age of Empires, Command & Conquer
This is what we’re going to take a look at today.
If you want to give RTS AI a try
Student StarCraft AI Tournament
AIIDE StarCraft AI Competition
CIG StarCraft AI Competition
https://store.steampowered.com/app/464350/Screeps_World/
What makes these games difficult?
Standard techniques
Most commercial titles use FSMs, Hierarchical FSMs or Behaviour Trees
These can get quite complicated
Usually don’t provide a challenge to even moderately skilled players
Advanced architecture examples
Plan
Terrain analysis
How to represent the terrain?
First step – partitioning the map into relevant regions
Can be done by hand or using an algorithm
How to represent the terrain?
Useful for:
How to represent the terrain?
Cul-de-sac = dead end – a region that you can only reach from one other region
Chokepoint = a narrow place connecting two or more regions
We can use our abstraction to find these
One approach – filter out regions that aren’t narrow, then for each remaining candidate take all of its adjacent regions and start a BFS from them while excluding the candidate region. To optimize this, the BFS can be cut short.
Then we can use them – important buildings can be placed in cul-de-sacs; chokepoints are good spots for ambushes, or they can be guarded efficiently
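A minimal sketch of the BFS test described above, assuming the map has already been partitioned into a region adjacency graph; the names `adj` and `is_narrow` are illustrative, not from the slides:

```python
from collections import deque

def is_chokepoint(adj, region):
    """Check whether removing `region` disconnects its neighbours.

    adj: dict mapping region id -> set of adjacent region ids.
    """
    neighbours = adj[region]
    if len(neighbours) < 2:
        return False  # a dead end (cul-de-sac), not a chokepoint
    start = next(iter(neighbours))
    # BFS from one neighbour, never entering the candidate region.
    seen = {start}
    frontier = deque([start])
    while frontier:
        current = frontier.popleft()
        for nxt in adj[current]:
            if nxt != region and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    # If some neighbour is unreachable, `region` is a chokepoint.
    return not neighbours <= seen

def find_chokepoints(adj, is_narrow):
    # First filter out regions that aren't narrow, then test each candidate.
    return [r for r in adj if is_narrow(r) and is_chokepoint(adj, r)]
```

The BFS can be cut short as soon as all of the candidate’s neighbours have been reached.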
How to represent the terrain?
We can combine our regions with influence maps and potential fields to help guide our units
How to represent the terrain?
Influence maps
Each unit exerts some influence
This influence propagates outwards
Influence of allies is additive, influence of enemies subtractive
Can be used to make decisions about where to defend/attack
Can also lead to interesting emergent behaviours
Can easily be used to encourage cooperation between two or more AI players
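A minimal grid-based sketch of the propagation just described; the linear falloff and the radius derived from influence strength are arbitrary illustrative choices:

```python
def influence_map(width, height, units):
    """Build a simple influence map on a grid.

    units: list of (x, y, influence) triples; positive influence for
    allies, negative for enemies, so allied contributions add up and
    enemy contributions subtract.
    """
    grid = [[0.0] * width for _ in range(height)]
    for ux, uy, inf in units:
        radius = int(abs(inf))  # influence propagates outwards
        for y in range(max(0, uy - radius), min(height, uy + radius + 1)):
            for x in range(max(0, ux - radius), min(width, ux + radius + 1)):
                dist = abs(x - ux) + abs(y - uy)  # Manhattan distance
                if dist <= radius:
                    falloff = 1.0 - dist / (radius + 1)  # linear falloff
                    grid[y][x] += inf * falloff
    return grid
```

Summing the map over a region then gives a rough answer to “who controls this area?”, which is what the defend/attack decisions build on.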
How to represent the terrain?
Potential fields
Fields of forces that attract/repulse units
Can be computed dynamically based on enemy unit positions or other criteria
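A sketch of such a field as a sum of simple attraction/repulsion forces; the inverse-square force law is an illustrative choice, not something prescribed by the slides:

```python
import math

def potential_force(unit_pos, attractors, repulsors):
    """Sum simple inverse-square forces acting on a unit.

    attractors pull the unit in (e.g. goal positions), repulsors push
    it away (e.g. enemy units); both are lists of (x, y) positions.
    """
    fx = fy = 0.0
    for targets, sign in ((attractors, 1.0), (repulsors, -1.0)):
        for tx, ty in targets:
            dx, dy = tx - unit_pos[0], ty - unit_pos[1]
            dist_sq = dx * dx + dy * dy
            if dist_sq == 0:
                continue  # ignore coincident points
            dist = math.sqrt(dist_sq)
            strength = 1.0 / dist_sq
            # Unit vector towards the target, scaled by the strength.
            fx += sign * (dx / dist) * strength
            fy += sign * (dy / dist) * strength
    return fx, fy
```

Moving the unit a small step along the resulting force vector each frame yields the attract/repulse behaviour; the repulsor list can be rebuilt dynamically from enemy unit positions.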
Unit tactics
What to do in a fight?
Influence maps and potential fields can help with reasonable pathfinding
They can also help units get into good formations
What to do once we get into a fight?
What to do in a fight?
Simplest solution – fully scripted behaviours
Can achieve ok results (especially when based on tactics used by human players)
Examples
Random [RND] – picks a legal move with a uniform probability distribution
Attack-Closest [AC] – attacks the closest enemy in range
Attack-Weakest [AW] – ditto AC, but attacks the weakest enemy in range
Kiting [Kit] – ditto AC, but moves away from the closest enemy
Attack-Value [AV] – ditto AC, but attacks the unit 𝑢 with the highest 𝑢.𝑑𝑝𝑓 / 𝑢.ℎ𝑝
No-OverKill-Attack-Value [NOK-AV] – ditto AV, but will not try to attack a unit that has already been dealt lethal damage
Kiting-AV [Kit-AV] – ditto Kit, but attacks the unit 𝑢 with the highest 𝑢.𝑑𝑝𝑓 / 𝑢.ℎ𝑝
Scripted strategy w/ script X: assign script X to all units a player controls
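As an illustration of one of the scripts above, a minimal NOK-AV target-selection sketch; the `hp`/`dpf` attributes and the `incoming` damage bookkeeping are assumptions for this example:

```python
def nok_av_target(enemies, incoming):
    """No-OverKill-Attack-Value target selection.

    enemies: objects with .hp and .dpf attributes.
    incoming: dict mapping enemy -> damage already assigned to it this
    frame; targets already dealt lethal damage are skipped (no overkill).
    """
    alive = [e for e in enemies if e.hp - incoming.get(e, 0) > 0]
    if not alive:
        return None
    # Attack-Value: prefer the unit with the highest dpf/hp ratio.
    return max(alive, key=lambda e: e.dpf / e.hp)
```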
What to do in a fight?
Can be combined using FSMs for less predictability
More advanced solution – using some script assignment/planning algorithm
What to do in a fight?
Portfolio Greedy Search (PGS)
Use set of scripts
Try to find a good assignment of scripts to units using a greedy approach, simulating possible exploitation by the opponent
Portfolio Greedy Search – a.k.a. PGS
PortfolioGreedySearch sections:
Row 7 <-> [A] … default enemy
Row 8 <-> [B] … seed yourself
Row 9 <-> [C] … seed enemy
Rows 10-13 <-> [D] … improve
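A compact sketch of the PGS steps [A]–[D]; the `evaluate` callback (which simulates a fight between two unit→script assignments and returns a score for us) and the other names are illustrative, not the paper’s pseudocode:

```python
def portfolio_greedy_search(units, enemy_units, portfolio, evaluate, improve_rounds=1):
    """Sketch of Portfolio Greedy Search over a portfolio of scripts."""
    default = portfolio[0]
    enemy = {u: default for u in enemy_units}        # [A] default enemy
    mine = {}
    for u in units:                                  # [B] seed yourself greedily
        mine[u] = max(portfolio, key=lambda s: evaluate({**mine, u: s}, enemy))
    for u in enemy_units:                            # [C] seed enemy (simulated exploitation)
        enemy[u] = min(portfolio, key=lambda s: evaluate(mine, {**enemy, u: s}))
    for _ in range(improve_rounds):                  # [D] improve our assignment
        for u in units:
            mine[u] = max(portfolio, key=lambda s: evaluate({**mine, u: s}, enemy))
    return mine
```

In practice `evaluate` is a fast combat simulation (e.g. a playout), and the improve loop runs until a time budget is exhausted.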
What to do in a fight?
Nested Greedy Search (NGS)
Like PGS, but adds one level of recursion
For every assignment that a given player tries, the opponent can change their assignment once
Nested Greedy Search – a.k.a. NGS
BTW, the notation could not have been more confusing here…
NestedGreedySearch sections:
Rows 1,3 <-> [A] … default me
Rows 2,4 <-> [B] … default enemy
Row 5 <-> [C] … while we have time
Rows 6-10 <-> [C1] … check on reassignments
Row 9 <-> [C2] … should we reassign?
NestedGreedySearch notes: having GS on the second ply of the game tree effectively means that we move the non-convergence down by one ply.
What about proper planning?
Using proper planning algorithms based on tree search
How to deal with simultaneous and durative actions though?
What about proper planning?
Alpha Beta Considering Durations (ABCD)
Adaptation of Alpha Beta Search to games with simultaneous and durative moves
Takes durative actions into account by always rolling the game state forward until at least one player can do something
Splits simultaneous move nodes into two – FIRST and SECOND – and their moves get applied together
What about proper planning?
Alpha Beta Considering Durations (ABCD)
Different strategies for picking who goes first – alternating, random, 1-2-2-1 alternation…
This is NOT a theoretically sound model for simultaneous-move games!
s – state
d – depth to search
m0 – delayed action effect used for simultaneous nodes
α, β – bounds
Meant to be used with iterative deepening.
First, mind the real-time constraints. Note that this ABCD should be run in an iterative-deepening manner, so a timeout means “do not use this result at all”.
If we are in a terminal node (either the depth reaches zero or the maximal time for the scenario is reached), we return the valuation of the state.
This line condenses a lot of stuff.
[A] If no unit can perform an action (other than “pass”), advance the time to the first point at which some unit can perform an action.
This line condenses a lot of stuff.
[B] Given the current state, i.e., the state of the units, determine which players can move.
If at this stage we are in a simultaneous node, use the “policy” to determine which player makes its decision first.
Next, we iterate over the moves player “toMove” can do, saving the current batch of moves we’re going to investigate into m.
If we are in a simultaneous node and there is nothing in the m0 buffer, i.e., this ABCD call did not come from a simultaneous node…
…then we continue with the next ply of ABCD, but (!) passing the actions m as an argument.
This m will act as m0 in the next invocation, so we will not get into this branch next time.
Additionally, if we are near the end of the search depth, we do not bother resolving the simultaneous node, as we’re terminating anyway.
The else branch then handles the “delayed action” effect.
First, this version of ABCD does not use reversible actions. So even though we are performing DFS, we clone the state.
Then we resolve the delayed-action effect: if m0 contains some actions, we apply them here…
…before applying the currently selected actions.
The rest of the algorithm is standard alpha-beta pruning.
The state evaluation function is used here to evaluate nodes at certain depths.
LTD3(s): playout; instead of evaluating, finish the game by performing a playout – at each unit decision point, use a preselected script to select an action. Play until either or both sides are annihilated.
Other possibilities exist, e.g. Lanchester’s Attrition Laws.
(The LTD evaluation formulas are based on each unit’s Hit Points and Damage Per Frame.)
Makes having more units with lower HP better than having fewer units with a lot of HP.
If move ordering is done right, it improves the effect of alpha-beta pruning as better state values are found faster.
As we are running ABCD using iterative deepening, we can store information about promising moves from previous runs. This can be used to sort actions in consecutive ABCD invocations, allowing the search to run faster. If no such information is available, we can still use a script to suggest the “first move”.
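The walkthrough above can be condensed into a sketch. The state API (`is_terminal`, `legal_moves`, `apply`) and the `first_player` policy are assumptions for illustration; durative-action handling (rolling the state forward) is omitted, leaving just the FIRST/SECOND splitting of simultaneous nodes via the m0 buffer:

```python
import math

def abcd(state, depth, m0, alpha, beta, evaluate, first_player):
    """Condensed ABCD sketch on a simultaneous-move game.

    FIRST half of a simultaneous node stores its move in m0;
    SECOND half applies both moves together.
    """
    if depth <= 0 or state.is_terminal():
        return evaluate(state)
    # Policy decides who goes first; the other player decides in the
    # SECOND half (when m0 is already filled).
    to_move = first_player(state) if m0 is None else 1 - first_player(state)
    maximizing = (to_move == 0)
    value = -math.inf if maximizing else math.inf
    for m in state.legal_moves(to_move):
        if m0 is None:
            # FIRST half: delay our move via the m0 buffer, same depth.
            score = abcd(state, depth, m, alpha, beta, evaluate, first_player)
        else:
            # SECOND half: both moves known; clone-and-apply (no reversible actions).
            a, b = (m, m0) if maximizing else (m0, m)
            child = state.apply(a, b)
            score = abcd(child, depth - 1, None, alpha, beta, evaluate, first_player)
        if maximizing:
            value, alpha = max(value, score), max(alpha, score)
        else:
            value, beta = min(value, score), min(beta, score)
        if alpha >= beta:
            break  # standard alpha-beta cut-off
    return value
```

Note how the second mover sees the first mover’s choice – exactly why this model is not theoretically sound for simultaneous-move games.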
What about proper planning?
UCT Considering Durations (UCTCD)
Adaptation of MCTS to games with simultaneous and durative moves
Uses the same techniques as ABCD
UCT Considering Durations – a.k.a. UCTCD
Top-level of MCTS, limited both by a max number of iterations and by time.
MCTS iteration == Traverse.
To save memory, only save actions in the nodes and clone the current state at the beginning of each iteration (and then apply the actions to it).
SelectNode implements the MCTS tree policy using standard UCB, a.k.a. UCB-1.
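The UCB-1 tree policy just mentioned can be sketched as follows; the exploration constant `c` and the `(value, visits)` child representation are illustrative:

```python
import math

def ucb1(child_value, child_visits, parent_visits, c=1.4):
    """Standard UCB-1: exploitation term plus an exploration bonus
    that shrinks as the child gets visited more often."""
    if child_visits == 0:
        return math.inf  # always try unvisited children first
    return child_value / child_visits + c * math.sqrt(
        math.log(parent_visits) / child_visits)

def select_child(children, parent_visits):
    # children: list of (total_value, visits) pairs; pick the best index.
    scores = [ucb1(v, n, parent_visits) for v, n in children]
    return scores.index(max(scores))
```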
1. UpdateState is used to apply the moves stored in nodes to the (cloned) state.
2. Each node is also labeled as:
FIRST – simultaneous node, first player to decide
SECOND – simultaneous node, second player to decide
3. SOLO node – only one player can move; a sequence of SOLO-node moves is aggregated within a single node move of the corresponding player.
Traverse is formulated as a recursive procedure (row 19) and contains the four steps of the classic MCTS loop (albeit in a strange order):
1. Selection
2. Expansion
3. Simulation
4. Backpropagation
Rows 9-11: we’re at a leaf visited for the first time; do the move (10) and perform a playout (11).
Otherwise, update the state with the move, and…
Rows 14-15: for a terminal node, just return the value of the node.
Rows 17-19: otherwise generate all the children (17-18) and continue the traversal (19).
Rows 20-21: update the values of the current node.
Open questions
What to do in a fight?
Any purely tree-search-based approach will probably not be able to deal with the branching factor in medium or large fights
We can remedy this by restricting the set of possible actions
What to do in a fight?
Grouping units
Can be done using several criteria
What to do in a fight?
Neural networks
Have been applied in this area – Supreme Commander 2 (SC2), Age of Empires IV
In SC2:
What to do in a fight?
Neural networks
Possibly difficult to design in a way that produces good results, and difficult to adjust and fine-tune
Resulting behaviour can however be adjusted by altering inputs or post-processing outputs
The results can be strange and hard to understand for humans, which can be an issue
Scouting
How to gather and use intel?
Two questions:
How to gather and use intel?
How to gather information?
Create one or more scouting units that will be sent to scour the map
In some games, enemy start positions are predefined -> search them one by one
Otherwise generate candidate positions based on utility measures (distance, new area that would become visible, etc.)
Once enemies are found, potential fields can be used to both attract the scout to the enemy to get a better look and repel it to avoid being caught
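A sketch of scoring candidate scouting positions by the utility measures mentioned above (distance, newly revealed area); the sight radius and the weights are made-up parameters:

```python
import math

def scout_utility(candidate, scout_pos, visible, weights=(1.0, 1.0), sight=3):
    """Score a candidate scouting position.

    candidate, scout_pos: (x, y) tiles; visible: set of already-seen tiles.
    Higher is better: reward newly revealed tiles, penalize travel distance.
    """
    w_dist, w_new = weights
    dist = math.hypot(candidate[0] - scout_pos[0], candidate[1] - scout_pos[1])
    # Tiles within sight range of the candidate that we haven't seen yet.
    newly_visible = sum(
        1
        for dx in range(-sight, sight + 1)
        for dy in range(-sight, sight + 1)
        if dx * dx + dy * dy <= sight * sight
        and (candidate[0] + dx, candidate[1] + dy) not in visible)
    return w_new * newly_visible - w_dist * dist

def pick_scout_target(candidates, scout_pos, visible):
    return max(candidates, key=lambda c: scout_utility(c, scout_pos, visible))
```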
How to gather and use intel?
How to use the gathered information?
Some work exists on trying to estimate enemy strategy and strength based on Bayesian inference
Simpler approach – hardcode strategy estimates based on observed buildings
Implemented for Starcraft bot ICEBot, supposedly outperforms Bayesian approaches.
How to gather and use intel?
How to use the gathered information?
Can also be used to decide attack timing
If our army size is more than c * estimated enemy army size, attack
Building placement
Where to build?
Constraints:
Where to build?
Building spaces can be pre-defined, but this would probably be rather time-consuming
Starcraft example:
Resource management
How to manage our resources?
Resources are needed to:
How to allocate resources to these different tasks?
How to manage our resources?
Often not addressed very thoroughly
Easiest solutions – first come first served/predefined partitioning
It is theoretically possible to do some planning based on predefined priorities
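The first-come-first-served scheme over priority-ordered requests can be sketched in a few lines; the request format is an assumption for this example:

```python
def allocate(budget, requests):
    """First-come-first-served resource allocation.

    requests: list of (task, cost) pairs in priority order; returns the
    tasks that fit in the budget and the remaining resources.
    """
    funded = []
    for task, cost in requests:
        if cost <= budget:
            budget -= cost
            funded.append(task)
    return funded, budget
```

Predefined partitioning would instead split the budget up front (e.g. 60% economy, 40% military) and run a scheme like this within each partition.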
Overall strategy
How to put it all together?
Which buildings to build, units to create, upgrades to research?
How to make use of scouting data?
Where to send our units?
When to attack the enemy?
Advanced architecture examples
All of these are scripted at the strategy level.
How to put it all together?
Easiest solution – designer defined strategies
Can achieve good results when used in conjunction with capable systems
Notable example – Halo Wars 2 => one of the designers was a pro RTS player who helped create good micro and macro strategies based on how humans play
How to put it all together?
Useful tools for more complex systems:
GOAP – already explained during the F.E.A.R. lecture.
Game-tree search – also mostly explained. Possible tweaks – discard opponent nodes, make the search hierarchical.
Planning – see NAIL071, the planning and scheduling course taught by Roman Barták. Possible use – deciding what units to create in order to counter the enemy’s units.
Machine learning – can be used to handle specific tasks, e.g. state evaluation (in conjunction with some algorithm that necessitates it).
How to put it all together?
HTN – Hierarchical Task Network
Somewhat commonly used in game AI (for example in Transformers: Fall of Cybertron or Horizon Zero Dawn)
Designer-defined hierarchy of tasks
Each task is either primitive or compound
Compound tasks have (multiple) ways of being decomposed into other tasks
Each task has preconditions, and primitive tasks also have effects
Different decompositions of compound tasks are tried until an applicable plan is found
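A tiny decomposition sketch of the scheme described above; the domain (task names, precondition/effect functions) is hypothetical:

```python
def htn_plan(tasks, state, methods, operators):
    """Minimal HTN planner.

    tasks: ordered list of task names to achieve.
    operators: primitive task -> (precondition, effect) functions.
    methods: compound task -> list of decompositions (each a task list),
    tried in order until an applicable plan is found.
    """
    if not tasks:
        return []
    head, rest = tasks[0], tasks[1:]
    if head in operators:  # primitive task: check precondition, apply effect
        precond, effect = operators[head]
        if not precond(state):
            return None
        tail = htn_plan(rest, effect(state), methods, operators)
        return None if tail is None else [head] + tail
    for decomposition in methods.get(head, []):  # compound task: try decompositions
        plan = htn_plan(list(decomposition) + rest, state, methods, operators)
        if plan is not None:
            return plan
    return None
```

For instance, a compound “win” task could decompose to either “attack” directly or “build army, then attack”; the planner backtracks to the second decomposition when the attack precondition (having an army) fails.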
How to put it all together?
AHTN – Adversarial Hierarchical Task Network
Standard HTNs don’t simulate the opponent
AHTNs combine HTNs with minimax
Tasks for the first player are decomposed until a primitive action is found, then the algorithm switches to the other player
This assumes turn-based, perfect information games
Cheating
When and how to bend the rules?
Topics we covered:
Where you can cheat
Real-world example: Total War
Game basics
Combines real-time battles with turn-based strategy
Turn-based part – the player moves and recruits armies, decides where to attack, conducts diplomacy, etc.
Real-time part – the player fights battles with fixed armies (=> no resource management) and no fog of war
AI basics
AI systems always make decisions in a specific order.
The two systems here run in parallel.
Decides whether to attack or defend and assigns objectives.
Tactics are FSMs. They usually represent manoeuvres, such as flanking.
Attack Manager commands override tactics. It directs units once they actually get into a battle.
Shogun: Total War (2000)
Individual units (supposedly) controlled by neural networks (NNs)
As NNs aren’t good at solving multiple conflicting objectives, each unit has multiple (for moving formation, avoiding attack, etc.)
Battle AI was inspired by The Art of War => it had many hand-crafted rules based on the book
Diplomatic AI based on FSMs with some genetic algorithms to create individualised personalities
Empire: Total War (2009)
Notable changes to the setting and gameplay (for example naval battles were added)
Complete AI overhaul
Campaign AI now separate from the game => it has the same controls and access to the same information as the player
GOAP used for both campaign and battle AI
Very ambitious, but didn’t work all that well
Total War: Rome II (2013)
MCTS integrated into campaign AI
Multiple different task generators generate tasks, MCTS is then used twice – first to allocate resources, then again to move units
Even with optimizations, it can reportedly only look one step ahead before starting playouts
Total War: Attila (2015)
Further optimizations made to MCTS to speed up its decision-making (avoiding battles it was unlikely to win, ordering of moves imposed to prevent duplicate strategies…)
Influence maps were used to give the AI an idea of which regions are in danger
Diplomacy AI also improved -> more detailed deal evaluation and generation, each faction has a set of data-driven personalities
Faction personalities change over time to account for evolving game state and changes in leadership
Total War: Warhammer (2016)
Siege battle AI improvements
Emphasis on historical accuracy in earlier entries => bombard the walls then storm the city
The AI was implemented by simple FSMs
Changes made in Warhammer (partially necessitated by change in units)
Total War: Warhammer
AI uses lanes to determine points of attack
For each point it picks a tactic (storming the gates, breaching the walls or climbing them)
Units try to carry it out -> if successful, an entry point is created, if not, the remaining units are moved to reserves
Units in reserve can be deployed again somewhere else
Total War: Warhammer
Once units get inside the city a different AI system takes over
Multiple influence maps are used to inform units where to go (with info such as enemy influence, strategic value, exposure to missile fire, etc.)
Resources
Ontanón, S., Synnaeve, G., Uriarte, A., Richoux, F., Churchill, D., & Preuss, M. (2024). RTS AI problems and techniques. In Encyclopedia of Computer Graphics and Games (pp. 1595-1605). Cham: Springer International Publishing.
Safadi, F., Fonteneau, R., & Ernst, D. (2011). Artificial intelligence design for real-time strategy games. In NIPS Workshop on Decision Making with Multiple Imperfect Decision Makers.
Ontanón, S., & Buro, M. (2015, July). Adversarial hierarchical-task network planning for complex real-time games. In Proceedings of the 24th International Conference on Artificial Intelligence (pp. 1652-1658).
Uriarte, A., & Ontanón, S. (2014). Game-tree search over high-level game states in RTS games. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (Vol. 10, No. 1, pp. 73-79).
Vien, N. A., & Toussaint, M. (2015, March). Hierarchical monte-carlo planning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 29, No. 1).
Barriga, N. A., Stanescu, M., & Buro, M. (2017). Game tree search based on nondeterministic action scripts in real-time strategy games. IEEE Transactions on Games, 10(1), 69-77.
Erickson, G., & Buro, M. (2014). Global state evaluation in StarCraft. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (Vol. 10, No. 1, pp. 112-118).
Churchill, D., Buro, M., & Kelly, R. (2019, August). Robust continuous build-order optimization in starcraft. In 2019 IEEE Conference on Games (CoG) (pp. 1-8). IEEE.
Uriarte, A., & Ontañón, S. (2017, August). Single believe state generation for partially observable real-time strategy games. In 2017 IEEE Conference on Computational Intelligence and Games (CIG) (pp. 296-303). IEEE.
Lelis, L. H. (2017, August). Stratified Strategy Selection for Unit Control in Real-Time Strategy Games. In IJCAI (pp. 3735-3741).
Ng, P. H., Li, Y. J., & Shiu, S. C. (2011, June). Unit formation planning in RTS game by using potential field and fuzzy integral. In 2011 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2011) (pp. 178-184). IEEE.
Liu, S., Louis, S. J., & Nicolescu, M. (2013, August). Using CIGAR for finding effective group behaviors in RTS game. In 2013 IEEE Conference on Computational Intelligence in Games (CIG) (pp. 1-8). IEEE.
Balla, R. K., & Fern, A. (2009). UCT for tactical assault planning in real-time strategy games. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI).
Churchill, D., Preuss, M., Richoux, F., Synnaeve, G., Uriarte, A., Ontanón, S., & Čertický, M. (2024). Starcraft bots and competitions. In Encyclopedia of Computer Graphics and Games (pp. 1742-1759). Cham: Springer International Publishing.
Buro, M., & Furtak, T. M. (2004, May). RTS games and real-time AI research. In Proceedings of the Behavior Representation in Modeling and Simulation Conference (BRIMS) (Vol. 6370, pp. 1-8).
Adil, K., Jiang, F., Liu, S., Jifara, W., Tian, Z., & Fu, Y. (2017). State-of-the-art and open challenges in RTS game-AI and Starcraft. Int. J. Adv. Comput. Sci. Appl, 8(12), 16-24.
Wang, Z., Nguyen, Q. K., Thawonmas, R., & Rinaldo, F. (2013). Adopting scouting and heuristics to improve the ai performance in starcraft. Innovations in Information and Communication Science and Technology.
Si, C., Pisan, Y., & Tan, C. T. (2014, December). A scouting strategy for real-time strategy games. In Proceedings of the 2014 Conference on Interactive Entertainment (pp. 1-8).
Hostetler, J., Dereszynski, E. W., Dietterich, T. G., & Fern, A. (2012). Inferring strategies from limited reconnaissance in real-time strategy games. arXiv preprint arXiv:1210.4880.
Churchill, D., Saffidine, A., & Buro, M. (2012). Fast heuristic search for RTS game combat scenarios. In Proceedings of the AAAI conference on artificial intelligence and interactive digital entertainment (Vol. 8, No. 1, pp. 112-117).
Churchill, D., & Buro, M. (2013, August). Portfolio greedy search and simulation for large-scale combat in StarCraft. In 2013 IEEE Conference on Computational Intelligence in Games (CIG) (pp. 1-8). IEEE.
Moraes, R., Mariño, J., & Lelis, L. (2018, September). Nested-greedy search for adversarial real-time games. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (Vol. 14, No. 1).