Adventures in Machine Learning

Mastering Strategy Games: Optimizing the Minimax Algorithm with Alpha-Beta Pruning

Optimizing Minimax Algorithm

Do you love playing strategy games, such as chess, Go, or even tic-tac-toe? These games may seem simple, but under the hood, there are complex algorithms at work.

One of the most important algorithms for strategy games is the Minimax Algorithm. It is a recursive algorithm that is used to find the best move for the current player, assuming the opponent plays optimally as well.

However, as the game progresses, the number of possible game states increases exponentially, making it impossible to evaluate all of them. In this article, we will explore three techniques to optimize the Minimax Algorithm: caching, alpha-beta pruning, and Monte Carlo tree search.

The Challenge of Huge Game Trees

Let us consider the Simple-Nim game, where two players take turns removing counters from a pile. The player who takes the last counter wins.

The game tree of Simple-Nim grows exponentially with each move, making it infeasible to evaluate all the states and their respective values. For example, if there are only four counters in the pile, the first player has four possible moves: removing one, two, three, or four counters.

After any of these moves, the second player has three possible moves, and so on. Therefore, after each move, the number of possible states is reduced, but still, the branching factor is high, and the depth is also significant, making exhaustive evaluation impractical.

Caching to Speed Up the Code

Caching is a simple technique that can be applied to many recursive algorithms. The idea is to store the computed values of expensive function calls to avoid recomputing them.

For example, in the context of Simple-Nim, we can store the values of each game state and its corresponding minimax value in a dictionary. Then, when we encounter the same state again during the recursion, we can retrieve the stored value instead of recomputing it.

This can lead to a significant speed-up in the code and makes it possible to evaluate deeper game trees. However, caching comes with the cost of memory consumption, and we may need to limit the size of the cache to avoid running out of memory.

Alpha-Beta Pruning to Reduce Tree Size

Alpha-Beta pruning is a technique that reduces the number of game states that need to be evaluated by cutting off subtrees that are guaranteed to be worse than the best known move for the current player. The algorithm maintains two values: alpha and beta.

Alpha is the best value found so far for the maximizing player, and beta is the best value found so far for the minimizing player. During the recursion, if the value of a node is worse than alpha or better than beta, we can prune the entire subtree because we know the opponent will never choose that path since there is a better option already available.

With this technique, we can reduce the number of states evaluated by a significant amount, making it possible to evaluate even deeper trees. However, the effectiveness of alpha-beta pruning depends on the ordering of the nodes.

If the best moves are at the end of the list, pruning may not happen until later, reducing the effectiveness of the algorithm.

Monte Carlo Tree Search for Approximate Solutions

Monte Carlo tree search is a different approach to evaluating game trees. Instead of trying to evaluate all possible game states, it uses random simulation to explore the most promising paths.

The algorithm starts from the current game state and selects a move based on a formula that balances exploration (trying new moves) and exploitation (choosing the best move so far). Then, it simulates the game until the end by selecting random moves for both players.

Finally, it backpropagates the result to update the statistics of the visited nodes. Monte Carlo tree search can provide approximate solutions quickly, but the quality of the results depends on the number of simulations performed.

Therefore, it is a good choice for complex games where evaluating the exact value of each state is not feasible, but exploration can give some insights into the game.

Conclusion

Minimax Algorithm is a powerful algorithm used in games to find the best move based on its opponent’s move. However, as the size of the game tree increases, it becomes infeasible to evaluate all possible game states.

In this article, we explored three techniques to optimize the Minimax Algorithm: caching, alpha-beta pruning, and Monte Carlo tree search. Each of these techniques has its own advantages and disadvantages, making it essential to choose the right approach based on the characteristics of the game.

Implementing Alpha-Beta Pruning

We have already learned the basic concept of the Minimax Algorithm and how it works to find the optimal move for a player in a game. Alpha-Beta Pruning is an optimization technique used in conjunction with the Minimax algorithm to cut off the branches that are no longer needed.

In this article, we will learn how to implement Alpha-Beta Pruning and make use of it to play Simple-Nim better, a game of chance in which you need to remove counters to try and get your opponent to remove the final counter.

Changing Comprehension to For Loop

When we first learned about the Minimax algorithm and how to implement it, we used a list comprehension to generate the scores for each child node. However, we cannot use a list comprehension with Alpha-Beta Pruning because we will need to keep track of alpha and beta values.

Therefore, we need to use a for loop instead.


def alpha_beta_pruning(depth, nodeIndex, maximizingPlayer, values, alpha, beta):
if depth == 3:
return values[nodeIndex]
if maximizingPlayer:
best = -100000
for i in range(0, len(values)):
if values[i] == -1:
continue
best = max(best, alpha_beta_pruning(depth + 1, i, False, values, alpha, beta))
alpha = max(alpha, best)
if alpha >= beta:
break
return best
else:
best = 100000
for i in range(0, len(values)):
if values[i] == -1:
continue
best = min(best, alpha_beta_pruning(depth + 1, i, True, values, alpha, beta))
beta = min(beta, best)
if alpha >= beta:
break
return best

In this implementation, we keep track of alpha and beta by updating their values as we traverse the tree.

If alpha is greater than or equal to beta, the loop breaks, and we return to the recursive step, which means we have already found the optimal value.

Adding Starting Values for Alpha and Beta

To initialize alpha and beta, we use negative infinity and positive infinity, respectively. This ensures that the first iteration of the algorithm updates the values using the first child node, allowing us to perform the pruning in subsequent iterations.


def alpha_beta_pruning(depth, nodeIndex, maximizingPlayer, values, alpha=-math.inf, beta=math.inf):
if depth == 3:
return values[nodeIndex]
if maximizingPlayer:
best = -100000
for i in range(0, len(values)):
if values[i] == -1:
continue
best = max(best, alpha_beta_pruning(depth + 1, i, False, values, alpha, beta))
alpha = max(alpha, best)
if alpha >= beta:
break
return best
else:
best = 100000
for i in range(0, len(values)):
if values[i] == -1:
continue
best = min(best, alpha_beta_pruning(depth + 1, i, True, values, alpha, beta))
beta = min(beta, best)
if alpha >= beta:
break
return best

Breaking Out of the Loop to Prune the Tree

In the Alpha-Beta Pruning algorithm, when we find a better move than the current alpha or beta value, we can prune the remaining branches that will not contribute to the final decision value. Therefore, we need to break out of the loop when alpha is greater than or equal to beta.

We can add this conditional check at the end of the loop to break out and return the best value without evaluating the remaining child nodes.


...
if alpha >= beta:
break
return best

The Outcome of Alpha-Beta Pruning

The implementation of Alpha-Beta Pruning is useful when we have a game tree with a large number of nodes and cannot process all nodes. The alpha and beta values work to save time on evaluating child nodes because we know we can prune certain branches of the game tree.

By adding conditions that give up the search in certain cases, we can speed up the evaluation of the game tree and improve performance.

Using Minimax with Alpha-Beta Pruning for Simple-Nim

Now that we have learned how to implement Alpha-Beta Pruning, we can apply it to the Simple-Nim game. Simple-Nim is a two-player game in which players take turns removing stones from a pile.

The player who removes the last stone wins. We can implement the game by using the Minimax algorithm with Alpha-Beta Pruning.

The rules of the game are simple, and it allows us to test our implementation of the algorithm. Suppose the pile contains ten stones, and the first turn is taken by the maximizing player.

The player can remove one, two, or three stones; then, the minimizing player will remove one, two, or three stones. The game continues in this way until someone removes the last stone.

The implementation will evaluate the possible new states of the game and return an optimal move, i.e., how many stones should be removed from the pile to maximize the chances of winning.

Conclusion

In conclusion, implementing Alpha-Beta Pruning has allowed us to optimize the Minimax algorithm and reduce the number of nodes that we need to evaluate in a game tree. Using alpha and beta values allowed us to speed up the search by enabling pruning and not evaluating all of the child nodes in the tree.

In addition, by providing starting values for alpha and beta, we could start the algorithm smoothly and ensure we prune nodes in subsequent iterations. We have seen how Alpha-Beta Pruning works in a Simple-Nim game, and we can apply this method to solve complex games, making it easier and quicker to choose the best move to win the game.

The Minimax algorithm is a fundamental algorithm for strategy games, but evaluating all possible game states can be infeasible as the game progresses. Alpha-Beta pruning is an optimization technique that reduces the number of game states to consider, making it possible to evaluate even the complex games.

By adding starting values for alpha and beta and breaking out of the loop when alpha is greater than or equal to beta, we can prune certain branches of the game tree, speeding up the evaluation considerably. Alpha-Beta pruning is a powerful technique for optimizing the Minimax algorithm, and implementing it can lead to better performance in games.

Popular Posts