1 of 17

Game Theory and Machine Learning

with relevance to Biological Development

2 of 17

Why use game theory?

Models must deal with imperfect information and with interactions between agents (or networks), and they need to generate context from previous interactions.

Game theory (Wikipedia): mathematical models of conflict and cooperation between intelligent decision-makers.

Basic unit of analysis: the payoff matrix (example: the theory of cooperative sedentism, in which all groups agree not to move, leading to a lack of conflict).

              Move      No move
Move            0         -2
No move        -2          1
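As a worked check on the payoff matrix above, a small brute-force search (a sketch; the symmetric payoffs are taken from the matrix on this slide) finds the pure-strategy equilibria. Both coordination outcomes are stable, but mutual sedentism is payoff-dominant:

```python
import itertools

# Symmetric 2x2 payoffs from the sedentism example (row-player utilities;
# by symmetry the column player's payoffs are the transpose).
strategies = ["Move", "No move"]
payoff = {("Move", "Move"): 0, ("Move", "No move"): -2,
          ("No move", "Move"): -2, ("No move", "No move"): 1}

def is_pure_nash(row, col):
    """A profile is a pure Nash equilibrium if neither player gains by deviating."""
    row_ok = all(payoff[(r, col)] <= payoff[(row, col)] for r in strategies)
    # Column player's payoff at (row, col) is payoff[(col, row)] by symmetry.
    col_ok = all(payoff[(c, row)] <= payoff[(col, row)] for c in strategies)
    return row_ok and col_ok

equilibria = [p for p in itertools.product(strategies, repeat=2) if is_pure_nash(*p)]
print(equilibria)  # both symmetric profiles are equilibria (a coordination game)
best = max(equilibria, key=lambda p: payoff[p])
print(best)  # ('No move', 'No move'): cooperative sedentism is payoff-dominant
```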

3 of 17

Deep Learning (DL)

A survey of game theoretic approach for adversarial machine learning.

Zhou, Kantarcioglu, Xi (2018) WIREs Data Mining and Knowledge Discovery, doi:10.1002/widm.1259

DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker

Moravčík et al. (2017). arXiv:1701.01724

Reinforcement Learning (RL)

A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

Lanctot et al. (2017). arXiv:1711.00832

4 of 17

Generator (generates adversarial examples) vs. Discriminator (distinguishes real from fake examples)

5 of 17

Generator and Discriminator play a zero-sum, noncooperative (minimax) game.
6 of 17

When the problem is non-convex, the cost functions oscillate as the Generator and Discriminator alternately minimize and maximize their objectives.
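The oscillation can be illustrated without any neural networks: on the bilinear game f(x, y) = xy (a toy stand-in for the minimax objective, not an actual GAN), naive simultaneous gradient descent/ascent orbits the equilibrium at (0, 0) instead of converging:

```python
# Toy minimax illustration: f(x, y) = x * y, the simplest non-trivial
# two-player zero-sum objective. The unique equilibrium is (0, 0), but
# naive simultaneous updates spiral away from it, mirroring the cost
# oscillation seen when training on non-convex adversarial objectives.
def simultaneous_updates(x, y, lr=0.1, steps=100):
    trajectory = [(x, y)]
    for _ in range(steps):
        gx, gy = y, x                     # df/dx = y, df/dy = x
        x, y = x - lr * gx, y + lr * gy   # min player descends, max player ascends
        trajectory.append((x, y))
    return trajectory

traj = simultaneous_updates(1.0, 1.0)
radii = [x * x + y * y for x, y in traj]
print(radii[0], radii[-1])  # squared distance from (0, 0) grows every step
```

Each update multiplies the squared distance from the equilibrium by (1 + lr²), so the players spiral outward rather than settling.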

7 of 17

Nash Equilibrium

A strategy profile at which no player can improve their payoff by unilaterally changing strategy; in a zero-sum game, one player's gains are exactly balanced by the other player's losses.

Matching pennies is an example of a game with no pure-strategy Nash equilibrium: the only equilibrium is in mixed strategies (each player randomizes 50/50), so outcomes are random (not payoffs) and switching strategies should be common (and occasionally advantageous for one player).

Example: “Brain” and “Mind” flip coins and each reveal either “heads” or “tails”. If coins match, “Brain” wins. If coins do not match, “Mind” wins.

                  HEADS                      TAILS
HEADS     Brain (+1), Mind (-1)      Brain (-1), Mind (+1)
TAILS     Brain (-1), Mind (+1)      Brain (+1), Mind (-1)
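A brute-force check (a minimal sketch using the payoffs in the table above) confirms that no pure-strategy profile is stable, while the fair-coin mixed strategy yields an expected payoff of zero:

```python
import itertools

# Matching pennies: (Brain, Mind) payoffs for each (Brain move, Mind move).
moves = ["HEADS", "TAILS"]
payoff = {(b, m): (1, -1) if b == m else (-1, 1) for b in moves for m in moves}

def is_pure_nash(b, m):
    """Neither player should gain by unilaterally switching sides of the coin."""
    brain_ok = all(payoff[(alt, m)][0] <= payoff[(b, m)][0] for alt in moves)
    mind_ok = all(payoff[(b, alt)][1] <= payoff[(b, m)][1] for alt in moves)
    return brain_ok and mind_ok

pure_equilibria = [p for p in itertools.product(moves, repeat=2) if is_pure_nash(*p)]
print(pure_equilibria)  # []: someone always wants to switch

# At the mixed equilibrium both players flip fair coins (probability 1/4 per cell).
expected = sum(0.25 * payoff[(b, m)][0] for b in moves for m in moves)
print(expected)  # Brain's expected payoff is 0
```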

8 of 17

Semantics, Representations and Grammars for Deep Learning

David Balduzzi

https://arxiv.org/pdf/1509.08627.pdf

TOP: competition between art forger and museum curator with goal of Nash equilibrium.

RIGHT: application of Nash equilibrium to reinforcement learning models.

9 of 17

Self-Play

Competition between two models (or agents) on a range of basic games

Bansal, Pachocki, Sidor, Sutskever, Mordatch

Emergent Complexity via Multi-Agent Competition. arXiv:1710.03748

Mnih et al. (2015). Human-level control through deep reinforcement learning. Nature, 518, 529–533.

10 of 17

11 of 17

Applying Game Theory to Biological Development

12 of 17

Games Against Nature

A single rational player plays a pure or mixed strategy against nature, which plays a random (mixed) strategy. Strategy: carry an umbrella (U) or not (N).

Observer strategy:   U   N   N   U   U   U   N
Payoff:              0   1  -1   1   1   0   1

Nature acts as a random generator (the entropy of its states is location-dependent). The forecast is imperfect information that informs the deployed strategy.
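A minimal sketch of the umbrella decision against nature. The payoff numbers below are hypothetical (illustrative utilities, not taken from the slide); nature plays rain with a location-dependent probability, and an imperfect forecast of that probability informs the deployed strategy:

```python
# Hypothetical utilities: carrying an umbrella costs a little when it stays dry;
# getting rained on without one costs a lot.
payoffs = {("U", "rain"): 0, ("U", "dry"): -1,
           ("N", "rain"): -5, ("N", "dry"): 1}

def expected_payoff(strategy, p_rain):
    """Expected utility of a pure strategy against nature's mixed strategy."""
    return (p_rain * payoffs[(strategy, "rain")]
            + (1 - p_rain) * payoffs[(strategy, "dry")])

def best_response(p_rain):
    """The forecast probability p_rain informs which strategy to deploy."""
    return max(["U", "N"], key=lambda s: expected_payoff(s, p_rain))

print(best_response(0.8))  # 'U': the umbrella pays off when rain is likely
print(best_response(0.1))  # 'N': leave it at home when rain is unlikely
```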

13 of 17

Figure 1 in Scientific Reports, 8, 3514 (2018).

Distinctions between games against non-random (sentient) opponent and stochastic process (lottery):

  • Non-random (sentient) opponent = plays strategies optimally, but sometimes non-optimally (e.g. using pure and mixed strategies).

  • Stochastic process = a “win” will occur at a given probability or likelihood (e.g. Polya’s Urn).

14 of 17

Mean Field Games

Study of differential games with large populations of rational players

Goal is to understand Nash equilibrium states of these systems

Mean-Field-Type Games in Engineering

arXiv:1605.03281

Mean-field-type games

AIMS Mathematics, 2(4), 706–735

15 of 17

First-mover Games (and Stackelberg competition)

First developed for markets where there are multiple competing firms:

  • firms enter the market in a particular order, which determines their market share.

  • subsequent innovations also occur in a leader-follower fashion, which determines market dynamics.

EXAMPLE (from markets):

  • Firm A develops the first embryogenesis simulator (leader, has an advantage).

  • Firm B makes its own version, and enters the market two years later (follower).

  • Firm B introduces a key innovation in its next version, and becomes the leader.
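The leader's advantage can be made concrete with the textbook Stackelberg duopoly (linear demand P = a - b(q1 + q2), common marginal cost c; the numbers below are illustrative, not from the slides). The leader commits to a quantity knowing the follower will best-respond:

```python
# Illustrative parameters: demand intercept a, slope b, marginal cost c.
a, b, c = 10.0, 1.0, 2.0

def follower_best_response(q1):
    # The follower maximizes (a - b*(q1 + q2) - c) * q2  =>  q2 = (a - c - b*q1) / (2b).
    return max(0.0, (a - c - b * q1) / (2 * b))

# The leader anticipates that reaction; its optimum is q1 = (a - c) / (2b).
q1 = (a - c) / (2 * b)
q2 = follower_best_response(q1)
price = a - b * (q1 + q2)
profit_leader = (price - c) * q1
profit_follower = (price - c) * q2
print(q1, q2, profit_leader, profit_follower)  # the first mover produces and earns more
```

With these numbers the leader produces twice the follower's quantity and earns twice the profit, which is the market-share advantage the bullets above describe.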

16 of 17

Example: Tic-tac-toe and Minimax optimization

Tic-tac-toe is an example of first-mover dynamics.

  • A Stackelberg equilibrium occurs when both players move optimally (neither player achieves three-in-a-row, so the game ends in a draw).

  • “X” is the first mover, “O” is the second mover. The second mover often plays defense unless the first mover makes a suboptimal move.

  • A decision tree can be used to analyze these moves (figure at left).
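The decision-tree analysis can be sketched as a full minimax search of the tic-tac-toe game tree (a compact sketch of the standard algorithm; it confirms that optimal play by both movers ends in a draw):

```python
# Minimax over the complete tic-tac-toe game tree.
# Score: +1 if "X" (the first mover) wins, -1 if "O" wins, 0 for a draw.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for i, j, k in LINES:
        if board[i] != " " and board[i] == board[j] == board[k]:
            return board[i]
    return None

def minimax(board, player):
    """Value of the position for optimal play; X maximizes, O minimizes."""
    w = winner(board)
    if w:
        return 1 if w == "X" else -1
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0  # full board, no winner: draw
    scores = [minimax(board[:m] + player + board[m + 1:],
                      "O" if player == "X" else "X")
              for m in moves]
    return max(scores) if player == "X" else min(scores)

value = minimax(" " * 9, "X")
print(value)  # 0: with optimal moves on both sides, tic-tac-toe is a draw
```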

17 of 17

Stone et al. (2018). BioSystems, 173, 73–82.

The embryo can also be analyzed as a first-mover (Stackelberg) game:

  • advantages are in terms of size; for differentiated tissues they could be in terms of function or structure.

First move: sublineages 1 and 2 are established; sublineage 1 has a size/position advantage.

Second move: the sublineage 1 mother divides before sublineage 2's, extending its advantage (leader).

Third move: the sublineage 2 mother divides (follower).

Fourth move: another division event in sublineage 1 gives it a further advantage.