Computing winning positions in a game
Reading time20 minIn brief
Article summary
In this article, we explain you how to find winning positions in a game, i.e., how find configurations that will surely lead to victory.
Main takeaways
-
Winning a game consists in reaching a game configuration where the game is won.
-
More broadly, winning regions denote configurations, from which a winning configuration can surely be reached.
-
Winning regions can be constructed starting from winning configurations.
-
Algorithms such as Maximin can be used to find a good strategy in a game.
Article contents
1 — Video
Please check the video below. You can also read the transcripts if you prefer.
To go further
2 — The maximin algorithm
In game theory (more precisely in complete information, zero-sum games), the maximin algorithm aims at finding the best move to make at each step, assuming the opponent will play optimally. The goal for the opponent is therefore to minimize our gain, while ours is to maximize it.
Concretely, the main idea of the algorithm is to start from the leaves of the tree of solutions, and to give a score to each of these leaves, corresponding to the gain it provides us.
Then, we will go up the tree until we reach the root, by alternating minimizations of the scores, and maximizations. This alternance models the choices of either the player, or the opponent.
2.1 — Example
Let us describe what happens on an example, taken from the French Wikipedia page on the Minimax algorithm (minimax is just the symmetric problem of maximin in which we want to minimize some loss function rather than maximizing a gain):
Here, we have a tree of possible executions of a game, in which depths modeled with squares indicate our turn to play, and circles indicate the opponent’s turn.
For each circle node, the opponent will try to minimize our gain, by choosing to perform the action that leads to the minimum gain for us. Therefore, the minimax algorithm will assume the following choices:
What it now means is that, if I reach the state of the game $B1$, the opponent will play so that our total gain is 3, obtained from state $C3$. If we make the game go to state $B2$, our opponent can play so that gain is 5 by enforcing state $C4$. Finally, if we play to reach situation $B3$, the opponent can play so that gain is 2 by enforcing state $C8$.
The optimal choice for us is therefore to maximize the gain, and to play so that we reach state $B2$. Making this choice, the best gain one can expect when facing an optimal opponent is 5:
The minimax algorithm consists in alternating these steps of minimizations and maximizations, in order to maximize the gain we would ultimately expect when making a decision.
It is important to notice that the maximin algorithm will not return the sequence of plays that lead to the highest achievable gain (which in the example above would be $C9$ with a gain of 12), but the least worst sequence of play, i.e., the one that realistically leads to our best gain.
To conclude, notice that here we alternate decisions between players, which implies they take their decisions in turn. Still, there exist equivalent results for simultaneous games such as PyRat.
To go beyond
- The minimax algorithm.
Some codes to implement a minimax algorithm.