After weeks of user feedback indicating that our Gomoku AI was too predictable on the higher difficulty levels, I decided it was time to implement a more sophisticated decision-making algorithm. Pattern matching and minimax with alpha-beta pruning had taken us far, but to create an AI that could genuinely surprise players and adapt to different play styles, we needed something more robust. This is where Monte Carlo Tree Search (MCTS) entered the picture.

In this article, I'll walk through my experience implementing MCTS for our Gomoku game: the challenges I faced, the solutions I developed, and the results that transformed our game's AI from a mechanical opponent into one capable of strategic decision-making.

Understanding the Limitations of Traditional Approaches

Our previous Gomoku AI relied primarily on pattern recognition and minimax with alpha-beta pruning. This worked reasonably well for detecting immediate threats and opportunities, but it had clear limitations:

  • Depth constraints: Minimax could only look ahead a limited number of moves due to the enormous branching factor in Gomoku (a 15×15 board has 225 possible positions).
  • Pattern dependency: The AI was only as good as the patterns we programmed it to recognize, leading to blind spots and predictability.
  • Evaluation function challenges: Creating a comprehensive board evaluation function that captured all strategic nuances proved incredibly difficult.

These limitations became increasingly apparent as players learned to exploit the AI's weaknesses. It was time for a more advanced approach that could discover strategies rather than following predefined patterns.

Why Monte Carlo Tree Search?

MCTS emerged as the ideal solution for several compelling reasons:

  • Minimal domain knowledge required: Unlike a hand-crafted evaluation function, MCTS works effectively with little game-specific knowledge beyond the rules and a few lightweight heuristics.
  • Anytime algorithm: MCTS can be stopped at any time and will provide the best move found so far, making it perfect for adjustable difficulty levels.
  • Handles large branching factors: By sampling the search space rather than exploring it exhaustively, MCTS manages complexity efficiently.
  • Balances exploration and exploitation: The algorithm naturally balances trying new moves versus leveraging promising ones.

Plus, MCTS has a proven track record in complex games. It formed the backbone of AlphaGo's groundbreaking success and has been successfully applied to various other board games.

The Four Pillars of MCTS Implementation

My implementation follows the four standard phases of MCTS:

1. Selection

Starting from the root node (current board state), the algorithm recursively selects child nodes until reaching a leaf node. For selection, I implemented the Upper Confidence Bound for Trees (UCT) formula:

UCB1 = wins/visits + C * sqrt(ln(parentVisits)/visits)

This elegant formula balances exploitation (the win ratio term) with exploration (the second term). Setting the exploration constant C to the conventional √2 provided a good balance, though I made this configurable per difficulty level.

The code for the selection phase looks like this:

selectChild() {
    const C = 1.414; // Exploration constant (√2)
    let selectedChild = null;
    let bestScore = -Infinity;
    
    for (const child of this.children) {
        // An unvisited child would divide by zero below, so visit it first
        if (child.visits === 0) return child;
        
        // UCB1 formula: exploitation (win ratio) + exploration (uncertainty bonus)
        const exploitation = child.wins / child.visits;
        const exploration = C * Math.sqrt(Math.log(this.visits) / child.visits);
        const score = exploitation + exploration;
        
        if (score > bestScore) {
            bestScore = score;
            selectedChild = child;
        }
    }
    
    return selectedChild;
}
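
The selection descent itself is then just a loop that keeps calling selectChild() until it reaches a node that still has untried moves (or has no children at all). A minimal sketch of that loop, using the same untriedMoves field the expansion code below relies on:

function select(root) {
    let node = root;
    // Keep descending while the node is fully expanded and still has children to pick from
    while (node.untriedMoves.length === 0 && node.children.length > 0) {
        node = node.selectChild();
    }
    return node;
}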

2. Expansion

Once a leaf node is reached, I expand the tree by adding child nodes representing possible moves from that position. For Gomoku, this could potentially be any empty position on the board, but I implemented an optimization that only considers positions adjacent to existing pieces (within a distance of 2). This dramatically reduced the branching factor without sacrificing playing strength, since moves far from existing pieces are rarely optimal in Gomoku.

expand(color) {
    if (this.untriedMoves.length === 0) return null;
    
    // Randomly select an untried move
    const moveIndex = Math.floor(Math.random() * this.untriedMoves.length);
    const move = this.untriedMoves[moveIndex];
    this.untriedMoves.splice(moveIndex, 1);
    
    // Create new state with this move
    const newState = this.cloneState();
    newState[move.row][move.col] = color;
    
    // Create child node
    const childNode = new MCTSNode(newState, this, move);
    this.children.push(childNode);
    
    return childNode;
}
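
For context, here is roughly what the node structure behind selectChild() and expand() looks like. The field names match the snippets above; getCandidateMoves() is a hypothetical helper for the neighbor-restricted move generation described later in the optimizations section:

class MCTSNode {
    constructor(state, parent, move) {
        this.state = state;           // 2D board array for this position
        this.parent = parent;         // null for the root node
        this.move = move;             // the move that produced this position
        this.children = [];
        this.untriedMoves = getCandidateMoves(state); // candidate moves not yet expanded
        this.wins = 0;
        this.visits = 0;
    }
    
    cloneState() {
        // Deep-copy the board so simulations never mutate states stored in the tree
        return this.state.map(row => row.slice());
    }
    
    // selectChild(), expand(), and backpropagate() from this article complete the class
}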

3. Simulation

The simulation phase (also called "rollout") plays random moves from the expanded node until reaching a terminal state (win, loss, or draw). This is where I introduced a significant improvement over basic MCTS: instead of completely random playout, I implemented a "smart" simulation strategy.

In my enhanced simulation:

  • 70% of the time, the algorithm chooses the move with the highest heuristic score
  • 30% of the time, it selects a completely random valid move

The heuristic score prioritizes moves that create or block winning sequences, significantly improving simulation quality without adding much computational overhead. This semi-random approach keeps playouts realistic while preserving enough variety for the statistics to remain meaningful.
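
Here's a sketch of what that rollout policy could look like. The scoreMove(), checkTerminal(), and getCandidateMoves() helpers, as well as the 'black'/'white' color encoding, are assumptions for illustration rather than the exact project code:

function simulatePlayout(state, color) {
    let toMove = color;
    let result = checkTerminal(state); // hypothetical: 1 = AI win, 0 = loss, 0.5 = draw, null = ongoing
    
    while (result === null) {
        const moves = getCandidateMoves(state); // neighbor-restricted empty positions
        if (moves.length === 0) return 0.5;     // board exhausted: treat as a draw
        
        let move;
        if (Math.random() < 0.7) {
            // 70%: take the move with the highest heuristic score (creating/blocking sequences)
            move = moves[0];
            for (const m of moves) {
                if (scoreMove(state, m, toMove) > scoreMove(state, move, toMove)) move = m;
            }
        } else {
            // 30%: take a completely random valid move
            move = moves[Math.floor(Math.random() * moves.length)];
        }
        
        state[move.row][move.col] = toMove;
        toMove = toMove === 'black' ? 'white' : 'black';
        result = checkTerminal(state);
    }
    
    return result;
}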

4. Backpropagation

After simulation completes, the result (win, loss, or draw) is propagated back up the tree, updating statistics for each node along the path. These statistics guide future selections.

backpropagate(result) {
    // Walk from this node up to the root, updating the statistics used by selection
    let node = this;
    while (node !== null) {
        node.visits++;
        node.wins += result;  // result: 1 for a win, 0.5 for a draw, 0 for a loss
        node = node.parent;
    }
}
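
Tying the four phases together, the top-level runMCTS() routine that the Web Worker below calls can be structured roughly like this. It is a sketch assembled from the pieces above plus the simulatePlayout() sketch; in particular, the color bookkeeping as the tree alternates between players is simplified here:

function runMCTS(rootState, simulationCount, color) {
    const root = new MCTSNode(rootState, null, null);
    
    for (let i = 0; i < simulationCount; i++) {
        // 1. Selection: descend to a promising node that still has untried moves
        let node = select(root);
        
        // 2. Expansion: add one child for an untried move, if any remain
        const child = node.expand(color);
        if (child) node = child;
        
        // 3. Simulation: smart rollout from the new position
        const result = simulatePlayout(node.cloneState(), color);
        
        // 4. Backpropagation: update statistics along the path back to the root
        node.backpropagate(result);
    }
    
    // Return the most-visited root child, which is more robust than the raw win rate
    let best = root.children[0];
    for (const c of root.children) {
        if (c.visits > best.visits) best = c;
    }
    return best.move;
}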

Parallel Computing with Web Workers

One challenge with MCTS is that running thousands of simulations can block the main JavaScript thread, causing the UI to freeze. To solve this, I implemented a Web Worker architecture that offloads MCTS computation to a background thread.

The worker implementation sends board state and configuration to a separate thread, which runs the MCTS algorithm and returns the best move once complete:

// Main thread
function findMoveWithMCTS() {
    if (!mctsWorker) {
        initMCTSWorker();
    }
    
    mctsWorker.postMessage({
        state: gameBoard2D,
        simulationCount: difficultySettings[computerDifficulty].mctsSimulations,
        color: currentPlayer
    });
}

// Worker thread
self.addEventListener('message', function(e) {
    const { state, simulationCount, color } = e.data;
    const startTime = performance.now(); // record start time so the result can report how long MCTS ran
    
    const bestMove = runMCTS(state, simulationCount, color);
    
    self.postMessage({
        type: 'result',
        bestMove: bestMove,
        time: performance.now() - startTime,
        simulationCount: simulationCount
    });
});

This approach keeps the UI responsive even when running thousands of simulations, which is crucial for maintaining a smooth user experience.
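
For completeness, the main-thread side that creates the worker and consumes its result could look like the following. The worker file name and the applyMove() callback are hypothetical; only the message shape matches the snippets above:

let mctsWorker = null;

function initMCTSWorker() {
    mctsWorker = new Worker('mcts-worker.js'); // assumed file name
    
    mctsWorker.addEventListener('message', function(e) {
        if (e.data.type === 'result') {
            // Hand the chosen move back to the game loop
            applyMove(e.data.bestMove); // hypothetical callback into the game logic
            console.log(`MCTS: ${e.data.simulationCount} simulations in ${e.data.time.toFixed(0)} ms`);
        }
    });
    
    mctsWorker.addEventListener('error', function(err) {
        // If the worker dies, the hybrid system below falls back to the traditional AI
        console.error('MCTS worker failed, falling back to traditional AI', err);
        mctsWorker = null;
    });
}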

Integrating MCTS with the Existing AI System

Rather than completely replacing our previous AI, I integrated MCTS as part of a layered decision-making process:

  1. Check for immediate winning moves (own or opponent's)
  2. If none exist, check for critical threats that need addressing
  3. If the position is not tactically critical, use MCTS for strategic decision-making
  4. Fall back to traditional methods if MCTS fails for any reason

This hybrid approach combines the strengths of pattern recognition for tactical situations with MCTS for strategic decisions. I configured the system to use MCTS primarily in Hard and Expert difficulty modes, while Easy and Medium difficulties rely more on the traditional approach with occasional randomness for natural-feeling play.
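
In simplified form, the layered flow looks like the sketch below. The findWinningMove(), findCriticalThreat(), opponentOf(), and findMoveWithTraditionalAI() names are placeholders for the existing pattern-based routines, and the MCTS call is shown as synchronous for readability even though the real result comes back asynchronously from the Web Worker:

function chooseComputerMove(color) {
    // 1. Immediate wins: our own winning move first, then block the opponent's
    const winning = findWinningMove(color) || findWinningMove(opponentOf(color));
    if (winning) return winning;
    
    // 2. Critical threats (open fours, double threats, ...) handled by pattern matching
    const threat = findCriticalThreat(color);
    if (threat) return threat;
    
    // 3. Strategic decisions via MCTS on Hard and Expert
    if (computerDifficulty === 'hard' || computerDifficulty === 'expert') {
        const mctsMove = findMoveWithMCTS();
        if (mctsMove) return mctsMove;
    }
    
    // 4. Fall back to the traditional evaluation-based AI
    return findMoveWithTraditionalAI(color);
}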

Performance Optimizations

To make the MCTS algorithm run efficiently, I implemented several optimizations:

  • Neighbor-based move generation: Only considers positions near existing pieces
  • Early termination: Stops simulation immediately when a winning position is detected
  • Lightweight state representation: Uses simple 2D arrays rather than complex objects
  • Simulation count scaling: Adjusts simulation count based on difficulty level and available time

With these optimizations, the algorithm can perform about 1,000 simulations per second on mid-range hardware, allowing for strong play with just a second or two of "thinking" time.
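
As an illustration of the first optimization, the neighbor-restricted move generator used throughout the sketches above might look like this (the null-for-empty board encoding is an assumption):

function getCandidateMoves(state, distance = 2) {
    const size = state.length;
    const candidates = [];
    
    for (let row = 0; row < size; row++) {
        for (let col = 0; col < size; col++) {
            if (state[row][col] !== null) continue; // only empty cells are candidates
            
            // Keep the cell if any stone lies within `distance` cells of it
            let nearStone = false;
            for (let dr = -distance; dr <= distance && !nearStone; dr++) {
                for (let dc = -distance; dc <= distance && !nearStone; dc++) {
                    const r = row + dr;
                    const c = col + dc;
                    if (r >= 0 && r < size && c >= 0 && c < size && state[r][c] !== null) {
                        nearStone = true;
                    }
                }
            }
            if (nearStone) candidates.push({ row, col });
        }
    }
    
    return candidates;
}

On an empty board this returns nothing, so the real implementation needs a fallback such as opening in the center.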

Difficulty Level Calibration

One of MCTS's strengths is how easily it can be adjusted for different difficulty levels. I implemented the following configuration:

  • Hard mode: 1,000 simulations with advanced heuristics
  • Expert mode: 3,000 simulations with fully optimized parameters

For lower difficulties, I maintained our traditional approach since it was easier to deliberately introduce "mistakes" in a controlled way.
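
The per-difficulty setup referenced earlier via difficultySettings[computerDifficulty].mctsSimulations can then be a simple configuration object; the Easy and Medium entries here are placeholders for the traditional AI path:

const difficultySettings = {
    easy:   { mctsSimulations: 0 },     // traditional AI with deliberate "mistakes"
    medium: { mctsSimulations: 0 },     // traditional AI with occasional randomness
    hard:   { mctsSimulations: 1000 },  // MCTS with advanced heuristics
    expert: { mctsSimulations: 3000 }   // MCTS with fully optimized parameters
};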

Results and Player Feedback

The improvement in AI capability has been dramatic. In testing, the MCTS-powered AI demonstrates several advanced behaviors that weren't present in our previous implementation:

  • Strategic positioning for multi-pronged attacks
  • Preemptive defense against potential threats
  • Adaptability to different player styles
  • Less predictable move selection

Early player feedback has been overwhelmingly positive. Several players who previously found our AI too predictable have commented that it now "feels more human" and "seems to have a strategy." The MCTS implementation provides meaningful challenges even for skilled players while maintaining the escalating difficulty curve that makes the game accessible to beginners.

Future Improvements

While the current implementation has been successful, I'm already planning several enhancements:

  1. Opening book integration: Adding a database of strong opening moves to improve early-game play
  2. Progressive widening: A technique to handle the large branching factor even more efficiently
  3. Parallel MCTS: Utilizing multiple worker threads for even more simulations
  4. Learning component: Adding a simple learning mechanism to adapt to individual player styles over time

Each of these would further enhance the AI's capabilities while maintaining the efficient, adaptable nature of the MCTS approach.

Conclusion

Implementing Monte Carlo Tree Search in our Gomoku game has transformed the AI from a pattern-matching system to a strategic opponent capable of surprising even skilled players. The beauty of MCTS lies in its elegant balance of exploration and exploitation, allowing it to discover effective strategies rather than following prescribed patterns.

For other indie game developers considering AI improvements, I highly recommend exploring MCTS. Its flexibility, scalability, and relatively straightforward implementation make it an excellent choice for board games and other turn-based experiences. The algorithm offers a remarkable balance of implementation simplicity and gameplay sophistication that few other approaches can match.