A functional form game is described by an antisemetric function
that evaluates a pair of agents
Where the higher the value of , the better for agent . is a tie.
The proof here is that every game can be transformed into a sum of a transitive and a cyclic game.
A game is transitive if there is a rating function such that performance on the game is the difference in ratings:
Note that this optimal value for one player is not dependent on the other player, and so it can be found by training against a fixed opponent at least in theory. In practice, when the probability of winning of either side gets to be too high learning slows down significantly, so playing against an opponent of similar strength is best, which is why self-play works so well.
A game is cyclic if
In other words, wins against some agents are necessarily counterbalanced with losses against others.
An example of an entirely cyclic game is the disk game.
Let
\phi(v,w) = v^T \cdot \left( \begin{smallmatrix} 0&-1\\ 1&0 \end{smallmatrix} \right0)\cdot wNote that Rock Paper Scissors embeds in the disk game:
Here is another time Deepmind shows off this game in their blog.
@article{DBLP:journals/corr/abs-1901-08106,
author = {David Balduzzi and
Marta Garnelo and
Yoram Bachrach and
Wojciech M. Czarnecki and
Julien P{\'{e}}rolat and
Max Jaderberg and
Thore Graepel},
title = {Open-ended Learning in Symmetric Zero-sum Games},
journal = {CoRR},
volume = {abs/1901.08106},
year = {2019},
url = {http://arxiv.org/abs/1901.08106},
archivePrefix = {arXiv},
eprint = {1901.08106},
timestamp = {Sat, 02 Feb 2019 16:56:00 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1901-08106.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}