Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
556 views
in Technique[技术] by (71.8m points)

java - Tic-Tac-Toe minimax algorithm doesn't work with 4x4 board

So I've been working on this project for the past 3 weeks now. I managed to get the minimax function to work early on for a 3x3 board, however problems started arising when I tried using it for a 4x4 board, namely Java heap space errors. Since then, with the help of Alpha beta pruning, I've managed to bring down the number of required minimax calls within the minimax function from aprox. 59000 to 16000 to 11000 and then finally to 8000 calls(This is assuming an initial minimax call for a board with one slot already filled). The problem now however, is that the method just keeps running for 4x4 games. It simply keeps calling itself non stop, no errors, no result, nothing. Theoretically, the way I see it, my function should work for arbitrary board sizes, the only problem was memory. Now, since I've reduced the memory greed of my function greatly, I'd expected it to work. Well, it does for the 3x3. However, it doesn't for the 4x4. A brief explanation of what the function does: The function returns an array of size 2 containing the most favorable next move from amongst all the possible next moves along with the score expected to be achieved from that move. The scoring system is simple. +10 for an O win, -10 for an X win and 0 for a draw. The function is of course recursive. Within it you will find certain shortcuts that reduce the number of required calls to itself. For example if it's X's turn and the returned score is -10(which is the best possible score for X) then exit the loop, i.e. stop observing other potential moves from this state. Here's the code for class State:

private String [] state;    //Actual content of the board
private String turn;        //Whose turn it is
private static int n;       //Size of the board

public State(int n) {
    state = new String[n*n];
    for(int i = 0; i < state.length; i++) {
        state[i] = "-";
    }
    State.n = n;
}


public int[] newminimax47(int z) {
    int bestScore = (turn == "O") ? +9 : -9;    //X is minimizer, O is maximizer
    int bestPos = -1;
    int currentScore;
    int lastAdded = z;
    if(isGameOver() != "Not Gameover") {
        bestScore= score();
    }
    else {
        int i = 0;
        for(int x:getAvailableMoves()) {
            if(turn == "X") {   //X is minimizer
                setX(x);
                currentScore = newminimax47(x)[0];
                if(i == 0) {
                    bestScore = currentScore;
                    bestPos = x;
                    if(bestScore == -10)
                        break;
                }
                else if(currentScore < bestScore) {
                    bestScore = currentScore;
                    bestPos = x;
                    if(bestScore == -10)
                        break;
                }
            }
            else if(turn == "O") {  //O is maximizer
                setO(x);
                currentScore = newminimax47(x)[0];
                if(i == 0) {
                    bestScore = currentScore;
                    bestPos = x;
                    if(bestScore == 10)
                        break;
                }

                else if(currentScore > bestScore) {
                    bestScore = currentScore;
                    bestPos = x;
                    if(bestScore == 10)
                        break;
                }
            }
            i++;
        }
    }
    revert(lastAdded);
    return new int [] {bestScore, bestPos};
}

Complementary functions used by newminimax47():

isGameOver():

public String isGameOver() {
    if(n == 3) {
        //Rows 1 to 3
        if((state[0] != "-") && (state[0] == state[1]) && (state[1] == state[2]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[3] != "-") && (state[3] == state[4]) && (state[4] == state[5]))
            return (state[3] == "X") ? "X Won" : "O Won";
        else if((state[6] != "-") && (state[6] == state[7]) && (state[7] == state[8]))
            return (state[6] == "X") ? "X Won" : "O Won";

        //Columns 1 to 3
        else if((state[0] != "-")&&(state[0] == state[3]) && (state[3] == state[6]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[1] != "-") && (state[1] == state[4]) && (state[4] == state[7]))
            return (state[1] == "X") ? "X Won" : "O Won";
        else if((state[2] != "-") && (state[2] == state[5]) && (state[5] == state[8]))
            return (state[2] == "X") ? "X Won" : "O Won";

        //Diagonals
        else if((state[0] != "-") && (state[0]==state[4]) && (state[4] == state[8]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[6] != "-") && (state[6] == state[4]) && (state[4] == state[2]))
            return (state[6] == "X") ? "X Won" : "O Won";

        //Checking if draw
        else if((state[0] != "-") && (state[1]!="-") && (state[2] != "-") && (state[3]!="-") &&
                (state[4] != "-") && (state[5] != "-") && (state[6] != "-") && (state[7] != "-") &&
                (state[8] != "-"))
            return "Draw";
        else
            return "Not Gameover";
    }
    else {
        //Rows 1 to 4
        if((state[0] != "-") && (state[0] == state[1]) && (state[1] == state[2]) && (state[2] == state[3]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[4] != "-") && (state[4] == state[5]) && (state[5]==state[6]) && (state[6] == state[7]))
            return (state[4] == "X") ? "X Won" : "O Won";
        else if((state[8] != "-") && (state[8] == state[9]) && (state[9]==state[10]) && (state[10] == state[11]))
            return (state[8] == "X") ? "X Won" : "O Won";
        else if((state[12] != "-") && (state[12] == state[13]) &&(state[13] == state[14]) && (state[14] == state[15]))
            return (state[12] == "X") ? "X Won" : "O Won";

        //Columns 1 to 4
        else if((state[0] != "-") && (state[0] == state[4]) && (state[4] == state[8]) && (state[8] == state[12]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[1] != "-") && (state[1] == state[5]) && (state[5] == state[9]) && (state[9] == state[13]))
            return (state[1] == "X") ? "X Won" : "O Won";
        else if((state[2] != "-") && (state[2] == state[6]) && (state[6] == state[10]) && (state[10] == state[14]))
            return (state[2] == "X") ? "X Won" : "O Won";
        else if((state[3] != "-") && (state[3] == state[7]) && (state[7] == state[11]) && (state[11] == state[15]))
            return (state[3] == "X") ? "X Won" : "O Won";

        //Diagonale
        else if((state[0] != "-") && (state[0] == state[5]) && (state[5] == state[10]) && (state[10] == state[15]))
            return (state[0] == "X") ? "X Won" : "O Won";
        else if((state[12] != "-") && (state[12] == state[9]) &&  (state[9] == state[6]) && (state[6] == state[3]))
            return (state[0] == "X") ? "X Won" : "O Won";

        //Pruefe ob Gleichstand
        else if((state[0] != "-") && (state[1] != "-") && (state[2] != "-") && (state[3]!="-") &&
                (state[4] != "-") && (state[5] != "-") && (state[6] != "-") && (state[7] != "-") &&
                (state[8] != "-") && (state[9] != "-") && (state[10] != "-") && (state[11] != "-") &&
                (state[12] != "-") && (state[13] != "-") && (state[14] != "-") && (state[15] != "-")) 
            return "Draw";
        else
            return "Not Gameover";
    }   
}

Please excuse the bluntness of the isGameOver() method, it merely checks the state of the board( i.e. Win, Draw, Game not Over)

The getAvailableMoves() method:

public int[] getAvailableMoves() {
    int count = 0;
    int i = 0;
    for(int j = 0; j < state.length; j++) {
        if(state[j] == "-")
            count++;
    }
    int [] availableSlots = new int[count];
    for(int j = 0; j < state.length; j++){
        if(state[j] == "-")
            availableSlots[i++] = j;        
    }
    return availableSlots;
}

This method merely returns an int array with all available next moves(regarding the current state) or returns empty array if no moves are available or if game is over.

The score() method:

public int score() {
    if(isGameOver() == "X Won")
        return -10;
    else if(isGameOver() == "O Won")
        return +10;
    else 
        return 0;
}

setO(), setX() and revert():

public void setX(int i) {
    state[i] = "X";
    turn = "O";
    lastAdded = i;
}
public void setO(int i) {
    state[i] = "O";
    turn = "X";
    lastAdded = i;
}
public void revert(int i) {
    state[i] = "-";
    if(turn == "X")
        turn = "O";
    else
        turn = "X";
}

My main method looks like this for a 3x3 game:

public static void main(String args[]) {
    State s = new State(3);
    int [] ScoreAndRecommendedMove = new int[2];
    s.setX(8);
    ScoreAndRecommendedMove = s.newminimax47(8);
    System.out.println("Score: "+ScoreAndRecommendedMove[0]+" Position: "+ ScoreAndRecommendedMove[1]);
}

In this game, X has started the game with a move at position 8. The method in this case will return

Score: 0 Position: 4  

Meaning that O's most promising move is at position 4 and in the worst case will yield a score of 0(i.e a draw).

The following image is meant to give an idea of how newminimax47() works. In this case our starting state(board) is given the number 1. Note: The numbers indicate the precedence in creation of the regarded states. 1 was created before 2, 2 was created before 3, 3 was created before 4 and so on.enter image description here

In this scenario, the score and position eventually returned to state 1 will be

Score: 0 Position: 6

coming from state 8.

Note: The code you see is just snippets of the actual State class. These snippets on their own should allow you to recreate, and play around with, the newminimax47 function without problems(at least for 3x3). Any bugs you may find are not really bugs, they were simply not included in the snippets I copied here and the code should work without them. The lastAdded variable in the setO and setX functions for example is not included in the snippets here but I just realized you don't need it to be able to work with the minimax function, so you can just comment that out.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I played around with your code and there is quite a bit to say

bug

first of all there is a bug. I don't think your code actually works for a 3x3 board. The problem is the location you revert the move you add to the board. You do this at the end of the newminimax47 method exactly once, even though in the method you add moves to the board inside a for loop. This means that calling the method does not only compute something but also changes the board state, and the rest of the code expects it not to.

So remove the revert where it is now and in revert as soon as you can:

setX(x);                                                                                                                                                                                                                                             
currentScore = newminimax47(x)[0];                           
revert(x);       

this also implies you don't need the lastAdded variable.

play

It's a lot easier to see what is happening if you actually play against your own algorithm. Add a method to your State class

public void dump() {                                                        
    for (int y = 0; y < n; y++) {                                           
        for (int x = 0; x < n; x++) {                                       
            System.out.print(state[y * n + x]);                             
        }                                                                   
        System.out.println();                                               
    }                                                                       
}

and in you main something like

public void play() {                                                        
    State s=new State(3);                                                   
    Scanner in = new Scanner (System.in);                                   
    while (s.isGameOver().equals("Not Gameover")) {                         
        int[] options = s.getAvailableMoves();                              
        s.dump();                                                           
        System.out.println ("Your options are " + Arrays.toString(options));
        int move = in.nextInt();                                            
        s.setX(move);                                                       
        int [] ScoreAndRecommendedMove=new int[2];                          
        ScoreAndRecommendedMove=s.newminimax47(0);                           
        System.out.println("Score: "+ScoreAndRecommendedMove[0]+" Position: "+ ScoreAndRecommendedMove[1]);
        s.setO(ScoreAndRecommendedMove[1]);                                 
    }                                                                       
    s.dump();                                                               
}

and you can actually play against it. On a 3x3 board this works just fine for me. Unfortunately I estimated computing the first move on a 4x4 takes my computer approximately 48 hours.

data types

Your choice of data types is often a bit... strange. If you want to remember a single character, use char instead of String. If you want to return a yes/no decision, try use boolean. There are also some parts of the program that could be replaces by less code doing the same. But that wasn't your question, so on to...

algorithm

Ok, so what's wrong with minimax to solve this problem? Suppose the first four moves are X5, O8, X6 O7. Another possibility is to start the game with X5, O7, X6, O8. Yet another one is X6, O7, X5, O8. And finally there's X6, O8, X5, O7.

All four of those possibilities for the first four moves of the game lead to the exact same gamestate. But minimax will not recognise they're the same (basically there is no memory of parallel branches) so it will compute them all four. And the number of times each board state is computed will increase rapidly if you search deeper.

The number of possible games vastly outnumbers the number of possible board states. To estimate the number of games: at first there are 16 possible moves, then 15, then 14, 13, ... and so on. A rough estimation is 16!, although minimax won't have to compute all of them, because many of them will have finished before the 16th move.

An estimation for the number of game states is: every square on the board can be either empty, or an X, or an O. So that's 3^16 boards. Not all of them are actually valid boards, because the number of Xs on the board can be at most one more then the number of Os, but still it's close to 3^16.

16! possible games is about half a million times more then 3^16 possible board states. Which means we're approximately computing every board half a milion times in stead of just once.

The solution is to start remembering every gamestate you compute. Every time the recursive function is called first check if you already know the answer, and if so just return the old answer. It's a technique called memoization.

Memoization

I'll describe how to add memoization while using the data structures you already choose (even though I don't agree with them). To do memoization you need a collection on which you can do quick adds and quick lookups. A list (for example ArrayList) won't do us any good. It's fast to add values but to do a lookup is very slow in long lists. There are some options but the easiest one to use is HashMap. In order to use HashMap you need to create something that represents your state and you can use as a key. The most straight-forward is to just make a String with all the X/O/- symbols in it that represent your board.

So add

Map<String,int[]> oldAnswers = new HashMap<String,int[]>();                                                                  

to your State object.

Then at the start of your newminimax47 method create the String that represents the State and check if we already know the answer:

    String stateString = "";                                                
    for (String field : state) stateString += field;                        
    int[] oldAnswer = oldAnswers.get(stateString);                          
    if (oldAnswer != null) return oldAnswer;

Finally when you compute a new answer the end of newminimax47 you should not only return it, but also store it in the map:

    int[] answer = {bestScore, bestPos};                                    
    oldAnswers.put (stateString, answer);                                   
    return answer;

With memoization in place I was able to play a 4x4 game against your code. The first move is still slow (20 seconds) but after that everything is computed and it's very fast. If you want to speed it up further, you could look into alpha beta pruning. But the improvement won't be near as much as memoization. Another option is using more efficient data types. It won't reduce the theoretical order of your algorithm, but could still easily make it 5 times faster.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...