NAIL122 AI for Games
The AI of RTS games
Peter Guba, Jakub Gemrot
Faculty of Mathematics and Physics
Charles University
What are RTS games?
RTS = Real-Time Strategy
Generally any game that is played in real-time and involves strategizing
More narrow meaning – games like Starcraft, Age of Empires, Command & Conquer
This is what we’re going to take a look at today.
If you want to give RTS AI a try
Student StarCraft AI Tournament
AIIDE StarCraft AI Competition
CIG StarCraft AI Competition
https://store.steampowered.com/app/464350/Screeps_World/
What makes these games difficult?
Standard techniques
Most commercial titles use FSMs, Hierarchical FSMs or Behaviour Trees
These can get quite complicated
Usually don’t provide a challenge to even moderately skilled players
Advanced architecture examples
Plan
Terrain analysis
How to represent the terrain?
First step – partitioning the map into relevant regions
Can be done by hand or using an algorithm
How to represent the terrain?
Useful for:
How to represent the terrain?
Cul-de-sac = dead end – a region that you can only reach from one other region
Chokepoint = a narrow place connecting two or more regions
We can use our abstraction to find these
One approach – filter out regions that aren’t narrow, then for each remaining candidate take all of its adjacent regions and start a BFS from them while excluding the candidate region. To optimize this, the BFS can be cut short.
Then we can use them – important buildings can be placed in cul-de-sacs; chokepoints are good spots for ambushes, or they can be guarded efficiently
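A minimal sketch of the BFS test described above, assuming the map has already been partitioned into a region adjacency graph; the names `adj` and `is_narrow` are illustrative, not from the slides:

```python
from collections import deque

def is_chokepoint(adj, region):
    """Check whether removing `region` disconnects its neighbours.

    adj: dict mapping region id -> set of adjacent region ids.
    """
    neighbours = adj[region]
    if len(neighbours) < 2:
        return False  # a dead end (cul-de-sac), not a chokepoint
    start = next(iter(neighbours))
    # BFS from one neighbour, never entering the candidate region.
    seen = {start}
    frontier = deque([start])
    while frontier:
        current = frontier.popleft()
        for nxt in adj[current]:
            if nxt != region and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    # If some neighbour is unreachable, `region` is a chokepoint.
    return not neighbours <= seen

def find_chokepoints(adj, is_narrow):
    # First filter out regions that aren't narrow, then test each candidate.
    return [r for r in adj if is_narrow(r) and is_chokepoint(adj, r)]
```

The BFS can be cut short as soon as all of the candidate’s neighbours have been reached.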
How to represent the terrain?
We can combine our regions with influence maps and potential fields to help guide our units
How to represent the terrain?
Influence maps
Each unit exerts some influence
This influence propagates outwards
Influence of allies is additive, influence of enemies subtractive
Can be used to make decisions about where to defend/attack
Can also lead to interesting emergent behaviours
Can easily be used to encourage cooperation between two or more AI players
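A minimal grid-based sketch of the propagation just described; the linear falloff and the radius derived from influence strength are arbitrary illustrative choices:

```python
def influence_map(width, height, units):
    """Build a simple influence map on a grid.

    units: list of (x, y, influence) triples; positive influence for
    allies, negative for enemies, so allied contributions add up and
    enemy contributions subtract.
    """
    grid = [[0.0] * width for _ in range(height)]
    for ux, uy, inf in units:
        radius = int(abs(inf))  # influence propagates outwards
        for y in range(max(0, uy - radius), min(height, uy + radius + 1)):
            for x in range(max(0, ux - radius), min(width, ux + radius + 1)):
                dist = abs(x - ux) + abs(y - uy)  # Manhattan distance
                if dist <= radius:
                    falloff = 1.0 - dist / (radius + 1)  # linear falloff
                    grid[y][x] += inf * falloff
    return grid
```

Summing the map over a region then gives a rough answer to “who controls this area?”, which is what the defend/attack decisions build on.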
How to represent the terrain?
Potential fields
Fields of forces that attract/repulse units
Can be computed dynamically based on enemy unit positions or other criteria
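A sketch of such a field as a sum of simple attraction/repulsion forces; the inverse-square force law is an illustrative choice, not something prescribed by the slides:

```python
import math

def potential_force(unit_pos, attractors, repulsors):
    """Sum simple inverse-square forces acting on a unit.

    attractors pull the unit in (e.g. goal positions), repulsors push
    it away (e.g. enemy units); both are lists of (x, y) positions.
    """
    fx = fy = 0.0
    for targets, sign in ((attractors, 1.0), (repulsors, -1.0)):
        for tx, ty in targets:
            dx, dy = tx - unit_pos[0], ty - unit_pos[1]
            dist_sq = dx * dx + dy * dy
            if dist_sq == 0:
                continue  # ignore coincident points
            dist = math.sqrt(dist_sq)
            strength = 1.0 / dist_sq
            # Unit vector towards the target, scaled by the strength.
            fx += sign * (dx / dist) * strength
            fy += sign * (dy / dist) * strength
    return fx, fy
```

Moving the unit a small step along the resulting force vector each frame yields the attract/repulse behaviour; the repulsor list can be rebuilt dynamically from enemy unit positions.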
Unit tactics
What to do in a fight?
Influence maps and potential fields can help with reasonable pathfinding
They can also help units get into good formations
What to do once we get into a fight?
What to do in a fight?
Simplest solution – fully scripted behaviours
Can achieve ok results (especially when based on tactics used by human players)
Examples
Random [RND] – picks a legal move with a uniform probability distribution
Attack-Closest [AC] – attacks the closest enemy in range
Attack-Weakest [AW] – ditto AC, but attacks the weakest enemy in range
Kiting [Kit] – ditto AC, but moves away from the closest enemy
Attack-Value [AV] – ditto AC, but attacks the unit 𝑢 with the highest 𝑢.𝑑𝑝𝑓 / 𝑢.ℎ𝑝
No-OverKill-Attack-Value [NOK-AV] – ditto AV, but will not try to attack a unit that has already been dealt lethal damage
Kiting-AV [Kit-AV] – ditto Kit, but attacks the unit 𝑢 with the highest 𝑢.𝑑𝑝𝑓 / 𝑢.ℎ𝑝
Scripted strategy w/ script X: assign script X to all units a player controls
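As an illustration of one of the scripts above, a minimal NOK-AV target-selection sketch; the `hp`/`dpf` attributes and the `incoming` damage bookkeeping are assumptions for this example:

```python
def nok_av_target(enemies, incoming):
    """No-OverKill-Attack-Value target selection.

    enemies: objects with .hp and .dpf attributes.
    incoming: dict mapping enemy -> damage already assigned to it this
    frame; targets already dealt lethal damage are skipped (no overkill).
    """
    alive = [e for e in enemies if e.hp - incoming.get(e, 0) > 0]
    if not alive:
        return None
    # Attack-Value: prefer the unit with the highest dpf/hp ratio.
    return max(alive, key=lambda e: e.dpf / e.hp)
```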
What to do in a fight?
Can be combined using FSMs for less predictability
More advanced solution – using some script assignment/planning algorithm
What to do in a fight?
Portfolio Greedy Search (PGS)
Use set of scripts
Try to find a good assignment of scripts to units using a greedy approach, simulating possible exploitation by the opponent
Portfolio Greedy Search – a.k.a. PGS
PortfolioGreedySearch sections:
Row 7 <-> [A] … default enemy
Row 8 <-> [B] … seed yourself
Row 9 <-> [C] … seed enemy
Rows 10-13 <-> [D] … improve
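A compact sketch of the PGS steps [A]–[D]; the `evaluate` callback (which simulates a fight between two unit→script assignments and returns a score for us) and the other names are illustrative, not the paper’s pseudocode:

```python
def portfolio_greedy_search(units, enemy_units, portfolio, evaluate, improve_rounds=1):
    """Sketch of Portfolio Greedy Search over a portfolio of scripts."""
    default = portfolio[0]
    enemy = {u: default for u in enemy_units}        # [A] default enemy
    mine = {}
    for u in units:                                  # [B] seed yourself greedily
        mine[u] = max(portfolio, key=lambda s: evaluate({**mine, u: s}, enemy))
    for u in enemy_units:                            # [C] seed enemy (simulated exploitation)
        enemy[u] = min(portfolio, key=lambda s: evaluate(mine, {**enemy, u: s}))
    for _ in range(improve_rounds):                  # [D] improve our assignment
        for u in units:
            mine[u] = max(portfolio, key=lambda s: evaluate({**mine, u: s}, enemy))
    return mine
```

In practice `evaluate` is a fast combat simulation (e.g. a playout), and the improve loop runs until a time budget is exhausted.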
What to do in a fight?
Nested Greedy Search (NGS)
Like PGS, but adds one level of recursion
For every assignment that a given player tries, the opponent can change their assignment once
Nested Greedy Search – a.k.a. NGS
BTW, the notation could not have been more confusing here…
NestedGreedySearch sections:
Rows 1,3 <-> [A] … default me
Rows 2,4 <-> [B] … default enemy
Row 5 <-> [C] … while we have time
Rows 6-10 <-> [C1] … check on reassignments
Row 9 <-> [C2] … should we reassign?
NestedGreedySearch notes: having GS on the second ply of the game tree effectively means that we move the non-convergence down by one ply.
What about proper planning?
Using proper planning algorithms based on tree search
How to deal with simultaneous and durative actions though?
What about proper planning?
Alpha Beta Considering Durations (ABCD)
Adaptation of Alpha Beta Search to games with simultaneous and durative moves
Takes durative actions into account by always rolling the game state forward until at least one player can do something
Splits simultaneous move nodes into two – FIRST and SECOND – and their moves get applied together
What about proper planning?
Alpha Beta Considering Durations (ABCD)
Different strategies for picking who goes first – alternating, random, 1-2-2-1 alternation…
This is NOT a theoretically sound model for simultaneous-move games!
s – state
d – depth to search
m0 – delayed action effect used for simultaneous nodes
α, β – bounds
Meant to be used with iterative deepening.
First, mind the real-time constraints. Note that this ABCD should be run in an iterative-deepening manner, so a timeout means “do not use this result at all”.
If we are in a terminal node (either the depth reaches zero or the maximal time for the scenario is reached), we return the valuation of the state.
This line condenses a lot of stuff.
[A] If no unit can perform an action (other than “pass”), advance the time to the first point at which some unit can perform an action.
This line condenses a lot of stuff.
[B] Given the current state, i.e., the state of the units, determine which players can move.
If at this stage we are in a simultaneous node, use the “policy” to determine which player makes its decision first.
Next, we iterate over the moves player “toMove” can do, saving the current batch of moves we’re going to investigate into m.
If we are in a simultaneous node and there is nothing in the m0 buffer, i.e., this ABCD call did not come from a simultaneous node…
…then we continue with the next ply of ABCD, but (!) passing the actions m as an argument.
This m will act as m0 in the next invocation, so we will not get into this branch next time.
Additionally, if we are near the end of the search depth, we do not bother resolving the simultaneous node, as we’re terminating anyway.
The else branch then handles the “delayed action” effect.
First, this version of ABCD does not use reversible actions. So even though we are performing DFS, we clone the state.
Then we resolve the delayed-action effect: if m0 contains some actions, we apply them here…
…before applying the currently selected actions.
The rest of the algorithm is standard alpha-beta pruning.
The state evaluation function is used here to evaluate nodes at certain depths.
LTD3(s): playout; instead of evaluating, finish the game by performing a playout – at each unit decision point, use a preselected script to select an action. Play until either or both sides are annihilated.
Other possibilities exist, e.g. Lanchester’s Attrition Laws.
(The LTD evaluation formulas are based on each unit’s Hit Points and Damage Per Frame.)
Makes having more units with lower HP better than having fewer units with a lot of HP.
If move ordering is done right, it improves the effect of alpha-beta pruning as better state values are found faster.
As we are running ABCD using iterative deepening, we can store information about promising moves from previous runs. This can be used to sort actions in consecutive ABCD invocations, allowing the search to run faster. If no such information is available, we can still use a script to suggest the “first move”.
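The walkthrough above can be condensed into a sketch. The state API (`is_terminal`, `legal_moves`, `apply`) and the `first_player` policy are assumptions for illustration; durative-action handling (rolling the state forward) is omitted, leaving just the FIRST/SECOND splitting of simultaneous nodes via the m0 buffer:

```python
import math

def abcd(state, depth, m0, alpha, beta, evaluate, first_player):
    """Condensed ABCD sketch on a simultaneous-move game.

    FIRST half of a simultaneous node stores its move in m0;
    SECOND half applies both moves together.
    """
    if depth <= 0 or state.is_terminal():
        return evaluate(state)
    # Policy decides who goes first; the other player decides in the
    # SECOND half (when m0 is already filled).
    to_move = first_player(state) if m0 is None else 1 - first_player(state)
    maximizing = (to_move == 0)
    value = -math.inf if maximizing else math.inf
    for m in state.legal_moves(to_move):
        if m0 is None:
            # FIRST half: delay our move via the m0 buffer, same depth.
            score = abcd(state, depth, m, alpha, beta, evaluate, first_player)
        else:
            # SECOND half: both moves known; clone-and-apply (no reversible actions).
            a, b = (m, m0) if maximizing else (m0, m)
            child = state.apply(a, b)
            score = abcd(child, depth - 1, None, alpha, beta, evaluate, first_player)
        if maximizing:
            value, alpha = max(value, score), max(alpha, score)
        else:
            value, beta = min(value, score), min(beta, score)
        if alpha >= beta:
            break  # standard alpha-beta cut-off
    return value
```

Note how the second mover sees the first mover’s choice – exactly why this model is not theoretically sound for simultaneous-move games.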
What about proper planning?
UCT Considering Durations (UCTCD)
Adaptation of MCTS to games with simultaneous and durative moves
Uses the same techniques as ABCD
UCT Considering Durations – a.k.a. UCTCD
Top-level of MCTS, limited both by a max number of iterations and by time.
MCTS iteration == Traverse.
To save memory, only save actions in the nodes and clone the current state at the beginning of each iteration (and then apply the actions to it).
SelectNode implements the MCTS tree policy using standard UCB, a.k.a. UCB-1.
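The UCB-1 tree policy just mentioned can be sketched as follows; the exploration constant `c` and the `(value, visits)` child representation are illustrative:

```python
import math

def ucb1(child_value, child_visits, parent_visits, c=1.4):
    """Standard UCB-1: exploitation term plus an exploration bonus
    that shrinks as the child gets visited more often."""
    if child_visits == 0:
        return math.inf  # always try unvisited children first
    return child_value / child_visits + c * math.sqrt(
        math.log(parent_visits) / child_visits)

def select_child(children, parent_visits):
    # children: list of (total_value, visits) pairs; pick the best index.
    scores = [ucb1(v, n, parent_visits) for v, n in children]
    return scores.index(max(scores))
```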
1. UpdateState is used to apply the moves stored in nodes to the (cloned) state.
2. Each node is also labeled as:
FIRST – simultaneous node, first player to decide
SECOND – simultaneous node, second player to decide
3. SOLO node – only one player can move; a sequence of SOLO-node moves is aggregated within a single node move of the corresponding player.
Traverse is formulated as a recursive procedure (row 19) and contains the four steps of the classic MCTS loop (albeit in a strange order):
1. Selection
2. Expansion
3. Simulation
4. Backpropagation
Rows 9-11: we’re at a leaf visited for the first time; do the move (10) and perform a playout (11).
Otherwise, update the state with the move, and…
Rows 14-15: for a terminal node, just return the value of the node.
Rows 17-19: otherwise generate all the children (17-18) and continue the traversal (19).
Rows 20-21: update the values of the current node.
Open questions
What to do in a fight?
Any purely tree-search-based approach will probably not be able to deal with the branching factor in medium or large fights
We can remedy this by restricting the set of possible actions
What to do in a fight?
Grouping units
Can be done using several criteria
What to do in a fight?
Neural networks
Have been applied in this area – Supreme Commander 2 (SC2), Age of Empires IV
In SC2:
What to do in a fight?
Neural networks
Possibly difficult to design in a way that produces good results, and difficult to adjust and fine-tune
Resulting behaviour can however be adjusted by altering inputs or post-processing outputs
The results can be strange and hard to understand for humans, which can be an issue
Scouting
How to gather and use intel?
Two questions:
How to gather and use intel?
How to gather information?
Create one or more scouting units that will be sent to scour the map
In some games, enemy start positions are predefined -> search them one by one
Otherwise generate candidate positions based on utility measures (distance, new area that would become visible, etc.)
Once enemies are found, potential fields can be used to both attract the scout to the enemy to get a better look and repel it to avoid being caught
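A sketch of scoring candidate scouting positions by the utility measures mentioned above (distance, newly revealed area); the sight radius and the weights are made-up parameters:

```python
import math

def scout_utility(candidate, scout_pos, visible, weights=(1.0, 1.0), sight=3):
    """Score a candidate scouting position.

    candidate, scout_pos: (x, y) tiles; visible: set of already-seen tiles.
    Higher is better: reward newly revealed tiles, penalize travel distance.
    """
    w_dist, w_new = weights
    dist = math.hypot(candidate[0] - scout_pos[0], candidate[1] - scout_pos[1])
    # Tiles within sight range of the candidate that we haven't seen yet.
    newly_visible = sum(
        1
        for dx in range(-sight, sight + 1)
        for dy in range(-sight, sight + 1)
        if dx * dx + dy * dy <= sight * sight
        and (candidate[0] + dx, candidate[1] + dy) not in visible)
    return w_new * newly_visible - w_dist * dist

def pick_scout_target(candidates, scout_pos, visible):
    return max(candidates, key=lambda c: scout_utility(c, scout_pos, visible))
```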
How to gather and use intel?
How to use the gathered information?
Some work exists on trying to estimate enemy strategy and strength based on Bayesian inference
Simpler approach – hardcode strategy estimates based on observed buildings
Implemented for Starcraft bot ICEBot, supposedly outperforms Bayesian approaches.
How to gather and use intel?
How to use the gathered information?
Can also be used to decide attack timing
If our army size is more than c * estimated enemy army size, attack
Building placement
Where to build?
Constraints:
Where to build?
Building spaces can be pre-defined, but this would probably be rather time-consuming
Starcraft example:
Resource management
How to manage our resources?
Resources are needed to:
How to allocate resources to these different tasks?
How to manage our resources?
Often not addressed very thoroughly
Easiest solutions – first come first served/predefined partitioning
It is theoretically possible to do some planning based on predefined priorities
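The first-come-first-served scheme over priority-ordered requests can be sketched in a few lines; the request format is an assumption for this example:

```python
def allocate(budget, requests):
    """First-come-first-served resource allocation.

    requests: list of (task, cost) pairs in priority order; returns the
    tasks that fit in the budget and the remaining resources.
    """
    funded = []
    for task, cost in requests:
        if cost <= budget:
            budget -= cost
            funded.append(task)
    return funded, budget
```

Predefined partitioning would instead split the budget up front (e.g. 60% economy, 40% military) and run a scheme like this within each partition.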
Overall strategy
How to put it all together?
Which buildings to build, units to create, upgrades to research?
How to make use of scouting data?
Where to send our units?
When to attack the enemy?
Advanced architecture examples
All of these are scripted at the strategy level.
How to put it all together?
Easiest solution – designer defined strategies
Can achieve good results when used in conjunction with capable systems
Notable example – Halo Wars 2 => one of the designers was a pro RTS player who helped create good micro and macro strategies based on how humans play
How to put it all together?
Useful tools for more complex systems:
GOAP – already explained during the F.E.A.R. lecture.
Game-tree search – also mostly explained. Possible tweaks – discard opponent nodes, make the search hierarchical.
Planning – see NAIL071, the planning and scheduling course taught by Roman Barták. Possible use – deciding what units to create in order to counter the enemy’s units.
Machine learning – can be used to handle specific tasks, e.g. state evaluation (in conjunction with some algorithm that necessitates it).
How to put it all together?
HTN – Hierarchical Task Network
Somewhat commonly used in game AI (for example in Transformers: Fall of Cybertron or Horizon Zero Dawn)
Designer-defined hierarchy of tasks
Each task is either primitive or compound
Compound tasks have (multiple) ways of being decomposed into other tasks
Each task has preconditions, and primitive tasks also have effects
Different decompositions of compound tasks are tried until an applicable plan is found
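A tiny decomposition sketch of the scheme described above; the domain (task names, precondition/effect functions) is hypothetical:

```python
def htn_plan(tasks, state, methods, operators):
    """Minimal HTN planner.

    tasks: ordered list of task names to achieve.
    operators: primitive task -> (precondition, effect) functions.
    methods: compound task -> list of decompositions (each a task list),
    tried in order until an applicable plan is found.
    """
    if not tasks:
        return []
    head, rest = tasks[0], tasks[1:]
    if head in operators:  # primitive task: check precondition, apply effect
        precond, effect = operators[head]
        if not precond(state):
            return None
        tail = htn_plan(rest, effect(state), methods, operators)
        return None if tail is None else [head] + tail
    for decomposition in methods.get(head, []):  # compound task: try decompositions
        plan = htn_plan(list(decomposition) + rest, state, methods, operators)
        if plan is not None:
            return plan
    return None
```

For instance, a compound “win” task could decompose to either “attack” directly or “build army, then attack”; the planner backtracks to the second decomposition when the attack precondition (having an army) fails.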
How to put it all together?
AHTN – Adversarial Hierarchical Task Network
Standard HTNs don’t simulate the opponent
AHTNs combine HTNs with minimax
Tasks for the first player are decomposed until a primitive action is found, then the algorithm switches to the other player
This assumes turn-based, perfect information games
Cheating
When and how to bend the rules?
Topics we covered:
Where you can cheat
Real-world example: Total War
Game basics
Combines real-time battles with turn-based strategy
Turn-based part – the player moves and recruits armies, decides where to attack, conducts diplomacy, etc.
Real-time part – the player fights battles with fixed armies (=> no resource management) and no fog of war
AI basics
AI systems always make decisions in a specific order.
The two systems here run in parallel.
Decides whether to attack or defend and assigns objectives.
Tactics are FSMs. They usually represent manoeuvres, such as flanking.
Attack Manager commands override tactics. It directs units once they actually get into a battle.
Shogun: Total War (2000)
Individual units (supposedly) controlled by neural networks (NNs)
As NNs aren’t good at solving multiple conflicting objectives, each unit has multiple (for moving formation, avoiding attack, etc.)
Battle AI was inspired by The Art of War => it had many hand-crafted rules based on the book
Diplomatic AI based on FSMs with some genetic algorithms to create individualised personalities
Empire: Total War (2009)
Notable changes to the setting and gameplay (for example naval battles were added)
Complete AI overhaul
Campaign AI now separate from the game => it has the same controls and access to the same information as the player
GOAP used for both campaign and battle AI
Very ambitious, but didn’t work all that well
Total War: Rome II (2013)
MCTS integrated into campaign AI
Multiple different task generators generate tasks, MCTS is then used twice – first to allocate resources, then again to move units
Even with optimizations, it can reportedly only look one step ahead before starting playouts
Total War: Attila (2015)
Further optimizations made to MCTS to speed up its decision-making (avoiding battles it was unlikely to win, ordering of moves imposed to prevent duplicate strategies…)
Influence maps were used to give the AI an idea of which regions are in danger
Diplomacy AI also improved -> more detailed deal evaluation and generation, each faction has a set of data-driven personalities
Faction personalities change over time to account for evolving game state and changes in leadership
Total War: Warhammer (2016)
Siege battle AI improvements
Emphasis on historical accuracy in earlier entries => bombard the walls then storm the city
The AI was implemented by simple FSMs
Changes made in Warhammer (partially necessitated by change in units)
Total War: Warhammer
AI uses lanes to determine points of attack
For each point it picks a tactic (storming the gates, breaching the walls or climbing them)
Units try to carry it out -> if successful, an entry point is created, if not, the remaining units are moved to reserves
Units in reserve can be deployed again somewhere else
Total War: Warhammer
Once units get inside the city a different AI system takes over
Multiple influence maps are used to inform units where to go (with info such as enemy influence, strategic value, exposure to missile fire, etc.)
Resources
Ontanón, S., Synnaeve, G., Uriarte, A., Richoux, F., Churchill, D., & Preuss, M. (2024). RTS AI problems and techniques. In Encyclopedia of Computer Graphics and Games (pp. 1595-1605). Cham: Springer International Publishing.
Safadi, F., Fonteneau, R., & Ernst, D. (2011). Artificial intelligence design for real-time strategy games. In NIPS Workshop on Decision Making with Multiple Imperfect Decision Makers.
Ontanón, S., & Buro, M. (2015, July). Adversarial hierarchical-task network planning for complex real-time games. In Proceedings of the 24th International Conference on Artificial Intelligence (pp. 1652-1658).
Uriarte, A., & Ontanón, S. (2014). Game-tree search over high-level game states in RTS games. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (Vol. 10, No. 1, pp. 73-79).
Vien, N. A., & Toussaint, M. (2015, March). Hierarchical monte-carlo planning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 29, No. 1).
Barriga, N. A., Stanescu, M., & Buro, M. (2017). Game tree search based on nondeterministic action scripts in real-time strategy games. IEEE Transactions on Games, 10(1), 69-77.
Erickson, G., & Buro, M. (2014). Global state evaluation in StarCraft. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (Vol. 10, No. 1, pp. 112-118).
Churchill, D., Buro, M., & Kelly, R. (2019, August). Robust continuous build-order optimization in starcraft. In 2019 IEEE Conference on Games (CoG) (pp. 1-8). IEEE.
Uriarte, A., & Ontañón, S. (2017, August). Single believe state generation for partially observable real-time strategy games. In 2017 IEEE Conference on Computational Intelligence and Games (CIG) (pp. 296-303). IEEE.
Lelis, L. H. (2017, August). Stratified Strategy Selection for Unit Control in Real-Time Strategy Games. In IJCAI (pp. 3735-3741).
Ng, P. H., Li, Y. J., & Shiu, S. C. (2011, June). Unit formation planning in RTS game by using potential field and fuzzy integral. In 2011 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2011) (pp. 178-184). IEEE.
Liu, S., Louis, S. J., & Nicolescu, M. (2013, August). Using CIGAR for finding effective group behaviors in RTS game. In 2013 IEEE Conference on Computational Intelligence in Games (CIG) (pp. 1-8). IEEE.
Balla, R. K., & Fern, A. (2009). UCT for tactical assault planning in real-time strategy games. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI).
Churchill, D., Preuss, M., Richoux, F., Synnaeve, G., Uriarte, A., Ontanón, S., & Čertický, M. (2024). Starcraft bots and competitions. In Encyclopedia of Computer Graphics and Games (pp. 1742-1759). Cham: Springer International Publishing.
Buro, M., & Furtak, T. M. (2004, May). RTS games and real-time AI research. In Proceedings of the Behavior Representation in Modeling and Simulation Conference (BRIMS) (Vol. 6370, pp. 1-8).
Adil, K., Jiang, F., Liu, S., Jifara, W., Tian, Z., & Fu, Y. (2017). State-of-the-art and open challenges in RTS game-AI and Starcraft. Int. J. Adv. Comput. Sci. Appl, 8(12), 16-24.
Wang, Z., Nguyen, Q. K., Thawonmas, R., & Rinaldo, F. (2013). Adopting scouting and heuristics to improve the ai performance in starcraft. Innovations in Information and Communication Science and Technology.
Si, C., Pisan, Y., & Tan, C. T. (2014, December). A scouting strategy for real-time strategy games. In Proceedings of the 2014 Conference on Interactive Entertainment (pp. 1-8).
Hostetler, J., Dereszynski, E. W., Dietterich, T. G., & Fern, A. (2012). Inferring strategies from limited reconnaissance in real-time strategy games. arXiv preprint arXiv:1210.4880.
Churchill, D., Saffidine, A., & Buro, M. (2012). Fast heuristic search for RTS game combat scenarios. In Proceedings of the AAAI conference on artificial intelligence and interactive digital entertainment (Vol. 8, No. 1, pp. 112-117).
Churchill, D., & Buro, M. (2013, August). Portfolio greedy search and simulation for large-scale combat in StarCraft. In 2013 IEEE Conference on Computational Intelligence in Games (CIG) (pp. 1-8). IEEE.
Moraes, R., Mariño, J., & Lelis, L. (2018, September). Nested-greedy search for adversarial real-time games. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (Vol. 14, No. 1).