Algorithms and Data Structures (ALDAT) EVMINX4 Week 5
Games
Games with 2 players who take turns.
Examples: Tic-Tac-Toe (Boter Kaas en Eieren), Connect Four, Battleship, checkers, chess, Go, …
How do you find the best move?
Build a game tree.
Apply the minimax algorithm.
Use alpha-beta pruning to make minimax faster.
Game tree
[Figure: part of the Tic-Tac-Toe game tree, showing boards with X and O moves.]
See handouts.
Game tree
A backtracking algorithm with a minimax strategy. See handouts.
TicTacToe::chooseMove

int TicTacToe::chooseMove(Side s, int& bestRow, int& bestColumn) {
    Side opp(s==COMPUTER ? HUMAN : COMPUTER);
    // Start from the worst outcome for s, so that any better reply replaces it.
    int value(s==COMPUTER ? HUMAN_WIN : COMPUTER_WIN);
    int simpleEval(positionValue());
    if (simpleEval!=UNCLEAR)
        return simpleEval;               // game decided: no search needed
    for (int r(0); r<board.numrows(); ++r)
        for (int c(0); c<board.numcols(); ++c)
            if (squareIsEmpty(r, c)) {
                place(r, c, s);          // try the move
                int dc;                  // don't care: the opponent's best square is not used
                int reply(chooseMove(opp, dc, dc));
                place(r, c, EMPTY);      // undo the move
                if ((s==COMPUTER && reply>value) || (s==HUMAN && reply<value)) {
                    value=reply;
                    bestRow=r;
                    bestColumn=c;
                }
            }
    return value;
}
Tic-Tac-Toe: number of calls to chooseMove for the first move (computer starts)
Maximum: 1 + 9 + 9×8 + 9×8×7 + … + 9×8×7×6×5×4×3×2×1 = …
Stopping as soon as there is a winner = …
Applying alpha-beta pruning = …
Applying a transposition table = 7954
Also detecting identical positions (rotations and reflections) = 5204
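The first two counts can be reproduced with a small program. The sketch below is not the course's TicTacToe class: the flat 9-cell board, the cell encoding, and the names maxCalls, hasWon and countCalls are assumptions made for illustration. It only counts calls, so no position values are computed.

```cpp
#include <array>
#include <cassert>

// Assumed encoding for this sketch: 0 = empty, 1 = computer, 2 = human.
using Board = std::array<int, 9>;

// Upper bound from the slide: 1 + 9 + 9*8 + ... + 9!.
long maxCalls() {
    long total = 1, term = 1;
    for (int k = 9; k >= 1; --k) {
        term *= k;          // 9, 9*8, 9*8*7, ...
        total += term;
    }
    return total;
}

// True if `side` owns a complete row, column or diagonal.
bool hasWon(const Board& b, int side) {
    static const int lines[8][3] = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
        {0, 4, 8}, {2, 4, 6}};
    for (const auto& l : lines)
        if (b[l[0]] == side && b[l[1]] == side && b[l[2]] == side)
            return true;
    return false;
}

// Number of calls a plain minimax search makes when it stops
// at a win or at a full board.
long countCalls(Board& b, int side) {
    assert(side == 1 || side == 2);
    long calls = 1;                         // this call itself
    if (hasWon(b, 1) || hasWon(b, 2))
        return calls;                       // decided: no further recursion
    for (int i = 0; i < 9; ++i)
        if (b[i] == 0) {
            b[i] = side;                    // try the move
            calls += countCalls(b, side == 1 ? 2 : 1);
            b[i] = 0;                       // undo the move
        }
    return calls;
}
```

Under these assumptions, maxCalls() evaluates the sum to 986410, and countCalls on an empty board with the computer to move gives 549946.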
Alpha-beta pruning
See p. 398 (Weiss). After H2A is evaluated, C2, which is the minimum of the H2's, is at best a draw. Consequently, it cannot be an improvement over C1. We therefore do not need to evaluate H2B, H2C, and H2D, and can proceed directly to C3.
Minimax algorithm example
[Figure: example game tree with alternating MAX and MIN levels.]
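Since the figure is not reproduced here, a minimal sketch of minimax on a small tree may help. It assumes a complete binary tree stored implicitly (node p has children 2*p+1 and 2*p+2) and uses made-up leaf values; neither comes from the slide's figure.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Minimax over a complete binary tree stored implicitly:
// node p has children 2*p+1 and 2*p+2, and the leaves are the
// last leaves.size() nodes (leaves.size() must be a power of two).
int minimax(const std::vector<int>& leaves, int p, bool maxToMove) {
    assert(!leaves.empty());
    const int firstLeaf = static_cast<int>(leaves.size()) - 1;
    if (p >= firstLeaf)
        return leaves[p - firstLeaf];       // leaf: value is given
    int left  = minimax(leaves, 2 * p + 1, !maxToMove);
    int right = minimax(leaves, 2 * p + 2, !maxToMove);
    // MAX picks the larger child value, MIN the smaller one.
    return maxToMove ? std::max(left, right) : std::min(left, right);
}
```

With the hypothetical leaves 3, 5, 6, 9, 1, 2, 0, 8 and MAX at the root, the MAX level above the leaves yields 5, 9, 2, 8, the MIN level yields 5 and 2, and the root value is 5.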
Alpha-beta pruning example
[Figure: game tree with MAX and MIN levels, nodes annotated with bounds such as >=4 and <=4.]
Alpha-beta pruning

pair<int, int> cComp(int p, int a, int b, int d);
pair<int, int> cHuman(int p, int a, int b, int d);

pair<int, int> cComp(int p, int a, int b, int d) {
    int bestPos(p);
    int value(positionValue(p));
    if (value==UNCLEAR) {
        value=a;
        // The computer maximizes: raise alpha, stop once alpha reaches beta.
        for (int i(1); i<3 && a<b; ++i) {
            pair<int, int> r(cHuman(2*p+i, a, b, d+1));
            if (r.first>a) {
                value=r.first;
                a=value;
                bestPos=2*p+i;
            }
        }
    }
    return make_pair(value, bestPos);
}

Returned pair = (best value, best position). p = position, a = alpha, b = beta, d = depth.
Alpha-beta pruning

pair<int, int> cComp(int p, int a, int b, int d);
pair<int, int> cHuman(int p, int a, int b, int d);

pair<int, int> cHuman(int p, int a, int b, int d) {
    int bestPos(p);
    int value(positionValue(p));
    if (value==UNCLEAR) {
        value=b;
        // The human minimizes: lower beta, stop once beta drops to alpha.
        for (int i(1); i<3 && b>a; ++i) {
            pair<int, int> r(cComp(2*p+i, a, b, d+1));
            if (r.first<b) {
                value=r.first;
                b=value;
                bestPos=2*p+i;
            }
        }
    }
    return make_pair(value, bestPos);
}

Returned pair = (best value, best position). p = position, a = alpha, b = beta, d = depth.
Alpha-beta pruning

enum S {H, C};

pair<int, int> cMove(S s, int p, int a, int b, int d) {
    int bestPos(p);
    int value(positionValue(p));
    if (value==UNCLEAR) {
        value=(s==C) ? a : b;
        for (int i(1); i<3 && a<b; ++i) {
            pair<int, int> r(cMove((s==C) ? H : C, 2*p+i, a, b, d+1));
            if ((s==C && r.first>a) || (s==H && r.first<b)) {
                value=r.first;
                if (s==C) a=value; else b=value;
                bestPos=2*p+i;
            }
        }
    }
    return make_pair(value, bestPos);
}

Returned pair = (best value, best position). p = position, a = alpha, b = beta, d = depth, s = side, H = human, C = computer.
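To see the pruning at work, cMove can be run on a concrete tree. The sketch below is an adaptation, not the slide's code: positionValue is replaced by a lookup in a vector of hypothetical leaf values, the unused depth parameter is dropped, and the Search struct, its evaluated counter and the best() helper are names introduced only for this illustration.

```cpp
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

enum Side { H, C };   // H = human (MIN), C = computer (MAX)

struct Search {
    std::vector<int> leaves;   // leaf values, left to right (size: a power of two)
    int evaluated;             // number of leaves actually inspected

    explicit Search(std::vector<int> v) : leaves(std::move(v)), evaluated(0) {}

    // Returns (best value, best position); node p has children 2*p+1 and 2*p+2.
    std::pair<int, int> cMove(Side s, int p, int a, int b) {
        const int firstLeaf = static_cast<int>(leaves.size()) - 1;
        if (p >= firstLeaf) {                      // leaf: its value is known
            ++evaluated;
            return std::make_pair(leaves[p - firstLeaf], p);
        }
        int bestPos = p;
        int value = (s == C) ? a : b;
        for (int i = 1; i < 3 && a < b; ++i) {     // stop when the window closes
            std::pair<int, int> r = cMove(s == C ? H : C, 2 * p + i, a, b);
            if ((s == C && r.first > a) || (s == H && r.first < b)) {
                value = r.first;
                if (s == C) a = value; else b = value;
                bestPos = 2 * p + i;
            }
        }
        return std::make_pair(value, bestPos);
    }

    // Search from the root with the widest possible window, computer to move.
    std::pair<int, int> best() { return cMove(C, 0, INT_MIN, INT_MAX); }
};
```

For the hypothetical leaves 3, 5, 6, 9, 1, 2, 0, 8 the search returns value 5 with best position 1 (the left child of the root) while inspecting only 5 of the 8 leaves: the leaf 9 and the last two leaves are never looked at.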
Alpha-beta pruning

int TicTacToe::chooseMove(Side s, int& bestRow, int& bestColumn, int alpha, int beta) {
    Side opp(s==COMPUTER ? HUMAN : COMPUTER);
    int value(s==COMPUTER ? alpha : beta);
    int simpleEval(positionValue());
    if (simpleEval!=UNCLEAR)
        return simpleEval;
    for (int r(0); r<board.numrows(); ++r)
        for (int c(0); c<board.numcols(); ++c)
            if (squareIsEmpty(r, c)) {
                place(r, c, s);
                int dc;
                int reply(chooseMove(opp, dc, dc, alpha, beta));
                place(r, c, EMPTY);
                if ((s==COMPUTER && reply>value) || (s==HUMAN && reply<value)) {
                    value=reply;
                    if (s==COMPUTER) alpha=value; else beta=value;
                    bestRow=r;
                    bestColumn=c;
                    if (alpha>=beta)
                        return value;    // window closed: prune the remaining moves
                }
            }
    return value;
}
Transpositions
See p. 400 (Weiss). Two searches that arrive at identical positions.
Transpositions

class Position {
public:
    Position(const matrix<int>& theBoard): board(theBoard) { }
    bool operator<(const Position& rhs) const;
private:
    matrix<int> board;
};

// Lexicographic comparison: the first differing square decides.
bool Position::operator<(const Position& rhs) const {
    for (int i(0); i<board.numrows(); ++i)
        for (int j(0); j<board.numcols(); ++j)
            if (board[i][j]!=rhs.board[i][j])
                return board[i][j]<rhs.board[i][j];
    return false;
}

class TicTacToe {
    //...
private:
    map<Position, int> transpositions;
};

Position is a wrapper for the matrix board. Why is this needed?
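A standalone sketch of the same idea: std::map keeps its keys sorted, so the key type must provide an ordering such as operator<. The Board alias (a 3x3 std::array standing in for the course's matrix class) and the lookup helper are assumptions introduced here, not part of the slide.

```cpp
#include <array>
#include <cassert>
#include <map>

// Stand-in for the course's matrix class (assumption for this sketch).
using Board = std::array<std::array<int, 3>, 3>;

class Position {
public:
    explicit Position(const Board& theBoard) : board(theBoard) {}
    // std::map orders its keys with operator<; the first differing
    // square decides, exactly as in the slide's comparison.
    bool operator<(const Position& rhs) const {
        for (int i = 0; i < 3; ++i)
            for (int j = 0; j < 3; ++j)
                if (board[i][j] != rhs.board[i][j])
                    return board[i][j] < rhs.board[i][j];
        return false;                       // equal boards: neither is less
    }
private:
    Board board;
};

// Returns the stored value for board b, or fallback if b is not in the table.
int lookup(const std::map<Position, int>& table, const Board& b, int fallback) {
    std::map<Position, int>::const_iterator it = table.find(Position(b));
    return it != table.end() ? it->second : fallback;
}
```

Two keys count as the same map entry when neither compares less than the other, so returning false for equal boards is what makes a repeated position hit the stored entry.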
Transpositions

int TicTacToe::chooseMove(Side s, int& bestRow, int& bestColumn, int alpha, int beta, int depth) {
    Position thisPosition(board);
    // Consult the table only at depths 3 through 5.
    if (depth>=3 && depth<=5) {
        MapItr itr(transpositions.find(thisPosition));
        if (itr!=transpositions.end())
            return (*itr).second;
    }
    // as before...
                int reply(chooseMove(opp, dc, dc, alpha, beta, depth+1));
    // as before...
                    if (alpha>=beta)
                        goto Done;
                }
Done:
    // Store the result so that a later search reaching this position can reuse it.
    if (depth>=3 && depth<=5)
        transpositions[thisPosition]=value;
    return value;
}

Explain.
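A simplified, self-contained sketch of a transposition table. It deliberately differs from the code above: it uses plain minimax without alpha-beta (so every stored value is exact), stores positions at every depth, and uses a flat std::array board directly as the map key. The Solver struct, chooseValue and the cell encoding are names and choices assumed for this illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <map>

using Board = std::array<int, 9>;   // assumed encoding: 0 = empty, 1 = computer, 2 = human
enum { HUMAN_WIN = -1, DRAW = 0, COMPUTER_WIN = 1 };

bool hasWon(const Board& b, int side) {
    static const int lines[8][3] = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
        {0, 4, 8}, {2, 4, 6}};
    for (const auto& l : lines)
        if (b[l[0]] == side && b[l[1]] == side && b[l[2]] == side)
            return true;
    return false;
}

struct Solver {
    std::map<Board, int> transpositions;    // position -> exact minimax value

    int chooseValue(Board& b, int side) {
        assert(side == 1 || side == 2);
        std::map<Board, int>::iterator it = transpositions.find(b);
        if (it != transpositions.end())
            return it->second;              // position reached before: reuse
        int value;
        if (hasWon(b, 1)) value = COMPUTER_WIN;
        else if (hasWon(b, 2)) value = HUMAN_WIN;
        else if (std::find(b.begin(), b.end(), 0) == b.end()) value = DRAW;
        else {
            value = (side == 1) ? HUMAN_WIN : COMPUTER_WIN;   // worst case for side
            for (int i = 0; i < 9; ++i)
                if (b[i] == 0) {
                    b[i] = side;            // try the move
                    int reply = chooseValue(b, side == 1 ? 2 : 1);
                    b[i] = 0;               // undo the move
                    value = (side == 1) ? std::max(value, reply)
                                        : std::min(value, reply);
                }
        }
        transpositions[b] = value;          // remember this position
        return value;
    }
};
```

Starting from an empty board with the computer to move, this sketch reports a draw and ends with 5478 distinct positions in the table. The board alone is a sufficient key here because, with a fixed starting player, the number of marks on the board determines whose turn it is.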