javascript - Bug in a negamax algorithm, possibly related to the evaluation function? It works in some cases but not in others

Question

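I am trying to develop a Tic-Tac-Toe game with an unbeatable AI, and I have gotten my negamax function to return the correct output most of the time. Under certain predictable conditions, however, the computer picks a move that makes no sense and fails to block the user's win.

Based on the research I have done, the negamax algorithm itself seems fine. I suspect something is wrong with my evaluation function, in other words, with the way the heuristic value is calculated. Most of my energy went into getting negamax working, so the other functions were mostly an afterthought.

Below is just one example of what I am describing.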

// user: X, computer: O, turn: computer
// this scenario should return 1 to block user from [0, 1, 2]
// returns 5 when negamax called with compNum
// output is correct when negamax called with userNum (simulating user's turn)
var scenario01 = [
    userNum,    0,    userNum,
    userNum, compNum,    0,
    compNum,    0,       0
];

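When faced with this scenario, instead of choosing square 1 (top middle; array indices are zero-based) to block the user's win, the negamax function outputs 5, a square that is of no immediate use to either player. In my tests, however, when I call negamax on the same scenario with userNum as the argument, negamax does its job and finds square 1, the user's winning square.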
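To make the claim about this scenario easy to check independently of negamax, here is a minimal sketch that lists the squares where a player can complete a line in a single move. The values of userNum and compNum are not shown in the question, so -1 and 1 are assumptions, and immediateWins is a hypothetical helper, not part of my Board prototype.

```javascript
// Assumed encoding (the question does not show these values): 0 = empty cell.
const userNum = -1;
const compNum = 1;

// The eight winning lines on a 3x3 board stored as a flat array.
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

// Hypothetical helper: every empty square that completes a line for `player`.
function immediateWins(cells, player) {
  const wins = [];
  for (let i = 0; i < cells.length; i++) {
    if (cells[i] !== 0) continue; // occupied squares cannot be played
    const completes = LINES.some(
      (line) => line.includes(i) && line.every((j) => j === i || cells[j] === player)
    );
    if (completes) wins.push(i);
  }
  return wins;
}

const scenario01 = [
  userNum, 0,       userNum,
  userNum, compNum, 0,
  compNum, 0,       0,
];

console.log(immediateWins(scenario01, userNum)); // [ 1 ]
console.log(immediateWins(scenario01, compNum)); // []
```

Under these assumptions the user has exactly one winning square (1) and the computer has none, so any move other than 1 loses on the user's next turn.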

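I know I am missing something seemingly simple. I have a feeling it is related to how I set up userNum and compNum and how they are hardcoded. I may have failed to translate the idea of minimax into negamax: I may not be maximizing the value for both the user and the computer, or I may be maximizing from the wrong player's point of view. I would be grateful if someone could kindly give me a push in the right direction.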

The full working version and tests are available at https://repl.it/CnGD/4.

I don't want to post an overwhelming amount here, but if requested I can create a snippet on this site.

Negamax function

var negamax = (board, player, depth, lvl, α, β) => {
    if ((board.getWinner(userNum, compNum) !== null) || depth >= lvl) {
        return board.getWinner(userNum, compNum);
        // also tried:  return board.score(depth) * player;
    }
    var highScore = -Infinity;
    board.available().forEach(function (move) {
        board.mark(move, player);
        var score = -negamax(board, -player, depth + 1, lvl, -β, -α);
        board.undo(move);
        if (score > highScore) { // if better cell is found
            highScore = score; // note best score so far
            bestMove = move; // note best move so far
            α = Math.max(α, score);
            if (α >= β) { // alpha-beta pruning
                return bestMove;
            }
        }
    });
    return bestMove;
};
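For comparison, here is a minimal, self-contained sketch of the textbook negamax shape with alpha-beta pruning, written against a plain 9-element array rather than my Board prototype. It rests on several assumptions: userNum and compNum are taken to be -1 and 1 (the question does not show them), and getWinner, negamaxScore and chooseMove are hypothetical stand-ins. Unlike the function above, every recursive call returns a score from the point of view of the side to move, and the move itself is tracked only at the root. It also uses a plain for loop, since a return inside a forEach callback ends only that one callback and not the iteration.

```javascript
// Assumed encoding (not shown in the question): +1 and -1 so that -player switches sides.
const userNum = -1;
const compNum = 1;

const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

// Returns userNum, compNum, 0 for a draw, or null while the game is still open.
function getWinner(cells) {
  for (const [a, b, c] of LINES) {
    if (cells[a] !== 0 && cells[a] === cells[b] && cells[b] === cells[c]) {
      return cells[a];
    }
  }
  return cells.includes(0) ? null : 0;
}

// Score of the position from the point of view of `player`, the side to move.
function negamaxScore(cells, player, depth, alpha, beta) {
  const winner = getWinner(cells);
  if (winner !== null) {
    // winner * player is +1 if `player` won, -1 if the opponent won, 0 for a draw.
    return winner * player * (10 - depth);
  }
  let best = -Infinity;
  for (let move = 0; move < 9; move++) {
    if (cells[move] !== 0) continue;
    cells[move] = player;                 // make the move
    const score = -negamaxScore(cells, -player, depth + 1, -beta, -alpha);
    cells[move] = 0;                      // undo the move
    if (score > best) best = score;
    if (best > alpha) alpha = best;
    if (alpha >= beta) break;             // alpha-beta cutoff
  }
  return best;
}

// Root call: the only place where a move, rather than a score, is returned.
function chooseMove(cells, player) {
  let best = -Infinity;
  let choice = null;
  for (let move = 0; move < 9; move++) {
    if (cells[move] !== 0) continue;
    cells[move] = player;
    const score = -negamaxScore(cells, -player, 1, -Infinity, Infinity);
    cells[move] = 0;
    if (score > best) {
      best = score;
      choice = move;
    }
  }
  return choice;
}

const scenario01 = [
  userNum, 0,       userNum,
  userNum, compNum, 0,
  compNum, 0,       0,
];

console.log(chooseMove(scenario01, compNum)); // 1
console.log(chooseMove(scenario01, userNum)); // 1
```

Under these assumptions the sketch picks square 1 for scenario01 whichever side is to move: the computer blocks there, and the user wins there.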

Evaluation function

// these functions are part of a Board prototype
// I haven't included it all here, but will provide a link
getWinner: function (userNum, compNum) {
    return (
        this.validateWin(this.occupied(userNum)) ? userNum :
        this.validateWin(this.occupied(compNum)) ? compNum :
        this.matchAll(true) ? 0 :
        null
    );
},

score: function (depth) {
    if (board.getWinner(userNum, compNum) === userNum) {
        return depth - 10;
    } else if (board.getWinner(userNum, compNum) === compNum) {
        return 10 - depth; 
    } else {
        return 0;
    }
}
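Since score(depth) above is always positive for a computer win and negative for a user win, it is computer-centric, whereas negamax expects a value relative to the side to move. The sketch below shows one way the two could be related. It is only an illustration under assumptions: userNum and compNum are taken to be -1 and 1, and absoluteScore and relativeScore are hypothetical names that take the winner as a parameter instead of reading it from the board.

```javascript
// Assumed encoding (not shown in the question).
const userNum = -1;
const compNum = 1;

// Computer-centric score with the same shape as score(depth):
// positive when the computer has won, negative when the user has won.
function absoluteScore(winner, depth) {
  if (winner === userNum) return depth - 10;
  if (winner === compNum) return 10 - depth;
  return 0;
}

// Side-relative score: positive when `player`, the side to move, is the winner.
function relativeScore(winner, depth, player) {
  const sign = player === compNum ? 1 : -1;
  return sign * absoluteScore(winner, depth);
}

// A user win found at depth 2, seen from each side:
console.log(relativeScore(userNum, 2, compNum)); // -8
console.log(relativeScore(userNum, 2, userNum)); // 8
```

With this sign convention the same terminal position is bad for the computer and good for the user, which is what the negation in the recursive call relies on.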
