I have recently been trying to make a tic-tac-toe game using the minimax algorithm. I first created a board, then two players. Afterwards, I changed one of the players into the algorithm. I tried using something similar to this JavaScript implementation. I am not getting a syntax error; the algorithm just is not working.
For example, take the following game sequence.
The algorithm starts the game and places an "X" at the top left of the board, at index [0].
I, the player, place an "O" at the top right of the board, at index [2].
The algorithm places an "X" at the top center of the board, at index [1].
I, the player, place an "O" in the center of the board, at index [4].
The algorithm places an "X" at the middle left of the board, at index [3].
The error is that rather than blocking my win (at index [6]), the algorithm simply plays the next free position.
In the minimax algorithm, there is a minimizing agent (which seeks the lowest score) and a maximizing agent (which seeks the highest score; in this case, the AI). Below is the code. Can you help me find the issue, or suggest how I should proceed? I have been trying for the last two days. Hopefully the explanation above makes sense.
board = ["."] * 9
winning_comb = [[0,1,2],[3,4,5],[6,7,8],[0,3,6],[1,4,7],[2,5,8],[3,4,6],[0,4,8]]
game = True
def new_board():
    print (board[0] + "|" + board[1] + "|" + board[2])
    print (board[3] + "|" + board[4] + "|" + board[5])
    print (board[6] + "|" + board[7] + "|" + board[8])
new_board()
def winning(comb):
    global game
    for l in range(len(winning_comb)):
        a = winning_comb[l][0]
        f = winning_comb[l][1]
        v = winning_comb[l][2]
        if comb[a] == comb[f] == comb[v] == "O" or "x" == comb[a] == comb[f] == comb[v]:
            game = False
            if comb[a] == "x":
                return 1
            else:
                return -1
            break
        else:
            game = True
def minmax(board,depth, ismax):
    if winning(board) != None:
        h = winning(board)
        return h
    else:
        if ismax == True:
            bestscore = float('-inf')
            for k in range(len(board)):
                if board[k] == ".":
                    board[k] = "x"
                    score = minmax(board,depth+1,False)
                    board[k] = '.'
                    bestscore = max(bestscore, score)
            return bestscore
        else:
            bestscore = float('inf')
            for k in range(len(board)):
                if board[k] == ".":
                    board[k] = "O"
                    score = minmax(board,depth+1,True)
                    board[k] = '.'
                    bestscore = min(bestscore,score)
                    return bestscore
def player1() :
    bestscore = float('-inf')
    bestmove = 0
    for k in range(len(board)):
        if board[k] == ".":
            board[k] = "x"
            score = minmax(board, 0, False)
            board[k] = "."
            if score > bestscore:
                bestscore = score
                bestmove = k
    board[bestmove] = "x"
    new_board()
def player2():
    number = int(input("Please enter your poistion?") )
    board[number - 1 ] = "O"
    new_board()
    winning(board)
while game==True:
    player1()
    player2()
[3,4,6] shouldn't be a winning combination. It should be [2,4,6]
Some issues:
[3,4,6] should be [2,4,6]
Indentation of return bestscore is wrong in the second instance: currently it interrupts the loop.
If you backtrack after game = False is executed, that assignment should be undone. For that reason it is probably easier to not use that variable at all, and just call the function winning when needed. The main loop could then be:
while winning(board) is None:
    player1()
    if winning(board) is not None: # need to exit
        break
    player2()
bestscore will be (-)infinity when there is no more free cell in the grid, and the game really is a draw. In that case bestscore should be 0, so this draw is considered better than a loss, and worse than a win. So make sure winning does not return None in that case, but 0:
if not "." in comb:
    return 0
Not an error, but it is a bit odd that some functions take the board as argument, and others not. Make this consistent, and always pass the board as argument (player1(board), player2(board) and winning(board)).
With these changes, the AI will play the best play, although the calculation for the first move takes quite some time. You could improve by applying alpha-beta pruning.
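The points above can be put together in a minimal sketch. It assumes the same board encoding as the question ("." for empty, "x" for the AI, "O" for the human); the names WIN_LINES and best_move and the alpha/beta parameters are illustrative additions, not part of the original code. winning returns 0 for a draw, and the search uses alpha-beta pruning to cut branches that cannot change the result.

```python
import math

# All eight winning lines, with the corrected diagonal (2, 4, 6).
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (2, 4, 6), (0, 4, 8)]

def winning(board):
    """Return 1 if 'x' won, -1 if 'O' won, 0 for a draw, None if still open."""
    for a, b, c in WIN_LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return 1 if board[a] == "x" else -1
    if "." not in board:
        return 0  # full board without a winner: a draw
    return None

def minmax(board, ismax, alpha=-math.inf, beta=math.inf):
    """Minimax value of the position, with alpha-beta pruning."""
    result = winning(board)
    if result is not None:
        return result
    best = -math.inf if ismax else math.inf
    for k in range(9):
        if board[k] != ".":
            continue
        board[k] = "x" if ismax else "O"
        score = minmax(board, not ismax, alpha, beta)
        board[k] = "."  # undo the trial move
        if ismax:
            best = max(best, score)
            alpha = max(alpha, best)
        else:
            best = min(best, score)
            beta = min(beta, best)
        if beta <= alpha:
            break  # the other side will never allow this branch
    return best  # returned only after the loop has finished

def best_move(board):
    """Index of the best move for 'x', or None if the board is full."""
    best_score, move = -math.inf, None
    for k in range(9):
        if board[k] == ".":
            board[k] = "x"
            score = minmax(board, False)
            board[k] = "."
            if score > best_score:
                best_score, move = score, k
    return move
```

With the position from the question (x on 0 and 1, O on 2 and 4), best_move should pick index 6, which blocks the diagonal.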
Related
I've been working on an MCTS AI for a couple of days now. I tried to implement it on Tic-Tac-Toe, the least complex game I could think of, but for some reason my AI keeps making bad decisions. I've tried changing the value of UCB1's exploration constant, the number of iterations per search, and even the points awarded for winning, losing, and tying the game (trying to make a tie more rewarding, as this AI only plays second and should aim for a draw, or a win otherwise). As of now, the code looks like this:
import random
import math
import copy

class tree:
    def __init__(self, board):
        self.board = board
        self.visits = 0
        self.score = 0
        self.children = []

class mcts:
    def search(self, mx, player,):
        root = tree(mx)
        for i in range(1200):
            leaf = mcts.expand(self, root.board, player, root)
            result = mcts.rollout(self, leaf)
            mcts.backpropagate(self, leaf, root, result)
        return mcts.best_child(self, root).board

    def expand(self, mx, player, root):
        plays = mcts.generate_states(self, mx, player) #all possible plays
        if root.visits == 0:
            for j in plays:
                root.children.append(j) #create child_nodes in case they havent been created yet
        for j in root.children:
            if j.visits == 0:
                return j #first iterations of the loop
        for j in plays:
            if mcts.final(self, j.board, player):
                return j
        return mcts.best_child(self, root) #choose the one with most potential

    def rollout(self, leaf):
        mx = leaf.board
        aux = 1
        while mcts.final(self, mx, "O") != True:
            if aux == 1: # "X" playing
                possible_states = []
                possible_nodes = mcts.generate_states(self, mx, "X")
                for i in possible_nodes:
                    possible_states.append(i.board)
                if len(possible_states) == 1: mx = possible_states[0]
                else:
                    choice = random.randrange(0, len(possible_states) - 1)
                    mx = possible_states[choice]
                if mcts.final(self, mx, "X"): #The play by "X" finished the game
                    break
            elif aux == 0: # "O" playing
                possible_states = []
                possible_nodes = mcts.generate_states(self, mx, "O")
                for i in possible_nodes:
                    possible_states.append(i.board)
                if len(possible_states) == 1: mx = possible_states[0]
                else:
                    choice = random.randrange(0, len(possible_states) - 1)
                    mx = possible_states[choice]
            aux += 1
            aux = aux%2
        if mcts.final(self, mx, "X"):
            for i in range(len(mx)):
                for k in range(len(mx[i])):
                    if mx[i][k] == "-":
                        return -1 #loss
            return 0 #tie
        elif mcts.final(self, mx, "O"):
            for i in range(len(mx)):
                for k in range(len(mx[i])):
                    if mx[i][k] == "-":
                        return 1 #win

    def backpropagate(self, leaf, root, result): # updating our prospects stats
        leaf.score += result
        leaf.visits += 1
        root.visits += 1

    def generate_states(self, mx, player):
        possible_states = [] #generate child_nodes
        for i in range(len(mx)):
            for k in range(len(mx[i])):
                if mx[i][k] == "-":
                    option = copy.deepcopy(mx)
                    option[i][k] = player
                    child_node = tree(option)
                    possible_states.append(child_node)
        return possible_states

    def final(self,mx, player): #check if game is won
        possible_draw = True
        win = False
        for i in mx: #lines
            if i == [player, player, player]:
                win = True
                possible_draw = False
        if mx[0][0] == player: #diagonals
            if mx[1][1] == player:
                if mx[2][2] == player:
                    win = True
                    possible_draw = False
        if mx[0][2] == player:
            if mx[1][1] == player:
                if mx[2][0] == player:
                    win = True
                    possible_draw = False
        for i in range(3): #columns
            if mx[0][i] == player and mx[1][i] == player and mx[2][i] == player:
                win = True
                possible_draw = False
        for i in range(3):
            for k in range(3):
                if mx[i][k] == "-":
                    possible_draw = False
        if possible_draw:
            return possible_draw
        return win

    def calculate_score(self, score, child_visits, parent_visits, c): #UCB1
        return score / child_visits + c * math.sqrt(math.log(parent_visits) / child_visits)

    def best_child(self, root): #returns most promising node
        treshold = -1*10**6
        for j in root.children:
            potential = mcts.calculate_score(self, j.score, j.visits, root.visits, 2)
            if potential > treshold:
                win_choice = j
                treshold = potential
        return win_choice
#todo the AI takes too long for each play, optimize that by finding the optimal approach in the rollout phase
First off, the purpose of this AI is to return an altered matrix containing the best play it could make in that circumstance. I find myself wondering whether mistakes in my implementation of the MCTS algorithm are the reason behind all these bad plays. With that said, as I see it, the code does the following:
Check if the root already has its children, in case it has, choose the most promising.
Rollout a random simulation and save the result.
Update the leaf's score, its number of visits and the root's number of visits.
Repeat for 1200 iterations, in my example
Return the best move (matrix, child_node) possible.
Why is it not working? Why is it choosing bad plays instead of the optimal one? Is the algorithm wrongly implemented?
My mistake was choosing the node with the most visits in the expansion phase, when it should have been the one with the most potential according to the UCB1 formula. I also had errors in some of the if clauses, which meant that not all losses were being counted.
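As a rough illustration of selecting by UCB1 rather than by visit count, here is a minimal sketch. The dictionary-based nodes and the names ucb1 and select_child are assumptions made for this example, not the code from the question; unvisited children are given an infinite score so that each one is tried at least once.

```python
import math

def ucb1(total_score, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value of a child: average reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # always try an unvisited child first
    exploitation = total_score / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration

def select_child(children, parent_visits):
    """Pick the child with the highest UCB1 value (hypothetical node dicts)."""
    return max(children,
               key=lambda ch: ucb1(ch["score"], ch["visits"], parent_visits))

children = [
    {"name": "a", "score": 9, "visits": 10},
    {"name": "b", "score": 1, "visits": 10},
    {"name": "c", "score": 0, "visits": 0},
]
chosen = select_child(children, parent_visits=20)  # "c": not visited yet
```

Once every child has been visited, the same function favours the child whose average reward and exploration bonus together are highest.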
I'm making a Tic Tac Toe game, but I have run into a couple of problems.
The first problem is that when I have the board like:
X | - | X
or like:
- | X | X
It counts as a Win and I don't know why. If I put:
X | X | -
there's no problem with that.
The second problem is that I want to prevent the user from selecting a position that is already in use, so my idea is to add each selected position to a list and check whether the new position is already in it. The problem is that when I call the function a second time, it no longer has the previous list containing the earlier positions (it makes a new one), so it never detects that the position the user is giving is already in use.
board = ["-","-","-",
         "-","-","-",
         "-","-","-"]
def display():
    print(board[0] + " | " + board[1] + " | " + board[2])
    print(board[3] + " | " + board[4] + " | " + board[5])
    print(board[6] + " | " + board[7] + " | " + board[8])
def turn():
    exist=[]
    pos = int(input("Choose your position"))
    while True:
        if pos not in exist:
            exist.append(pos)
            print(exist)
            print("Correct position")
            break
        else:
            print("That position has been selected, choose another one")
    board[pos-1] = "X"
    display()
def checkwin():
    #row
    if (board[0] and board[1] and board[2] == "X") or (board[3] and board[4] and board[5] == "X") or (board[6] and board[7] and board[8] == "X"):
        print("Victory")
        return False
    else:
        return True
display()
while checkwin():
    turn()
    checkwin()
print(board)
print("Game has ended")
NOTE: Right now the game only has the X player; I still need to add the O player. Also, it can only detect a win on the rows. I'm still working on this.
To your second question, you are declaring your list inside the function turn, which is called every time a player makes a choice. When a function returns, all of its local variables are no longer accessible. Compare this with your code.
def addToList( item ):
    items = []
    items.append( item )
addToList( 'item1' )
addToList( 'item2' )
Each time the function is called, items = [] is called as well. Edit to clarify my first statement, functions create their own scope. The scope of a variable is where it is available in your program. For example
def addItem( item ):
    items = []
    items.append( item )
addItem( "item1" )
print( items )
NameError: name 'items' is not defined
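One way to keep the list between calls is to create it once outside the function and pass it in. The sketch below is only an illustration of that idea; the names add_item and taken are made up for the example.

```python
def add_item(items, item):
    """Append to a list owned by the caller, so it survives between calls."""
    items.append(item)
    return items

taken = []          # created once, outside the function
add_item(taken, 1)
add_item(taken, 2)
print(taken)        # both positions are still there
```

Because taken lives in the caller's scope, every call sees the positions added by earlier calls.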
The logic of (board[0] and board[1] and board[2] == "X") won't work because you are only checking whether board[2] == 'X'; the others are merely tested for truthiness, and since '-' is a non-empty string it also evaluates as true.
You should instead check like this: (board[0] == board[1] == board[2] == "X").
This ensures that all three values are equal to 'X'.
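To make the difference concrete, here is a small sketch comparing the two checks on the row from the question. It then applies the chained comparison to every line of the board; WIN_LINES and has_won are illustrative names, not part of the original code.

```python
row = ["-", "X", "X"]

# Original style: only the last operand is compared with "X".
buggy = bool(row[0] and row[1] and row[2] == "X")

# Chained comparison: all three cells must equal "X".
fixed = row[0] == row[1] == row[2] == "X"

# The same idea applied to rows, columns and diagonals.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def has_won(board, mark):
    """True when `mark` fills any complete line of the 9-cell board."""
    return any(board[a] == board[b] == board[c] == mark
               for a, b, c in WIN_LINES)
```

Here buggy is True even though the row is not complete, while fixed is False.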
sin tribu nailed it as to why your turn function is behaving the way it is.
As to how to make it work right... perhaps I'm missing something, but aren't you looking to make sure you can't select a space that's already occupied? For that, you could just check the board directly.
def turn():
    pos = int(input("Choose your position"))
    while board[pos-1] == "X":
        pos = int(input("That position has been selected, choose another one"))
    print("Correct position")
    board[pos-1] = "X"
    display()
This will keep asking until you specify an empty square.
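Once the O player is added, the same check could test for any non-empty square and also reject input that is out of range or not a number. The following is a sketch under those assumptions; is_free and ask_position are hypothetical helpers, and the read parameter exists only so the function can be exercised without a keyboard.

```python
def is_free(board, pos):
    """True when pos (1-based) is on the board and the square is empty."""
    return 1 <= pos <= len(board) and board[pos - 1] == "-"

def ask_position(board, read=input):
    """Keep asking until the user names an empty square."""
    while True:
        answer = read("Choose your position (1-9): ")
        if answer.isdigit() and is_free(board, int(answer)):
            return int(answer)
        print("That position is not available, choose another one")
```

In the game loop this would be used as pos = ask_position(board), followed by board[pos-1] = "X".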
A few days ago I started making a simple board game. First of all, I generate a board for the game. It looks like this:
[Image: the game board generated for 13x13]
Secondly, I place my character, 'A', on the board:
[Image: the player placed on the board]
I made a die for it that generates numbers from 1 to 6.
My goal right now is to move the 'A' character along the '*' symbols by the number rolled, until it reaches the top left corner:
[Image: the position I need to reach using the dice]
So here is my code that I tried:
import math
import random
import os
board= []
def generator(boardsize):
    for row in range(boardsize+1):
        brow = []
        for column in range(boardsize+1):
            if row == column == 0:
                brow.append(' ')
            elif row==0:
                brow.append(str(column-1)[-1])
            elif column==0:
                brow.append(str(row-1)[-1])
            elif ((math.ceil(boardsize/2)-1 )<= column) and(column <= (math.ceil(boardsize/2)+1)) or ((math.ceil(boardsize/2)-1 )<= row) and(row <= (math.ceil(boardsize/2)+1)):
                if row == 1 or column == 1 or row == boardsize or column == boardsize:
                    brow.append('*')
                else:
                    if row == (math.ceil(boardsize/2)) and column == (math.ceil(boardsize/2)):
                        brow.append('X')
                    elif row == (math.ceil(boardsize/2)) or column == (math.ceil(boardsize/2)):
                        brow.append('D')
                    else:
                        brow.append('*')
            else:
                brow.append(' ')
        board.append(brow)
    return board
def print_table(x):
    os.system('cls')
    for x in board:
        print(' '.join(x))
number_from_dice= []
def dice():
    min = 1
    max = 6
    x = random.randint(min, max)
    number_from_dice[:]= [x]
    return number_from_dice
def player1(x):
    generator(x)
    prev_char_y = 1
    prev_char_x = math.ceil(x/2)+1
    char_y= 1
    char_x= math.ceil(x/2)+1
    board[char_y][char_x] = "A"
    print_table(x)
    dice()
    f = number_from_dice[0]
    for i in range(f):
        if(char_y<x):
            if (board[char_y+1][char_x]) == '*':
                char_y= char_y +1
                board[char_y][char_x] = "A"
                board[prev_char_y][prev_char_x] = '*'
                prev_char_x = char_x
                prev_char_y = char_y
                print_table(x)
            else:
                if(char_x!=x):
                    char_x2 = char_x
                    if (board[char_y][char_x+1]=='*'):
                        char_x = char_x +1
                        board[char_y][char_x] = "A"
                        board[prev_char_y][prev_char_x] = '*'
                        prev_char_x = char_x
                        prev_char_y = char_y
                        print_table(x)
                    else:
                        if (board[char_y+1][char_x]) == '*':
                            char_y= char_y +1
                            board[char_y][char_x] = "A"
                            board[prev_char_y][prev_char_x] = '*'
                            prev_char_x = char_x
                            prev_char_y = char_y
                            print_table(x)
                        else:
                            if (board[char_y][char_x2-1]) == '*':
                                char_x2 = char_x2 -1
                                board[char_y][char_x2] = "A"
                                board[prev_char_y][prev_char_x] = '*'
                                prev_char_x = char_x2
                                prev_char_y = char_y
                                print_table(x)
                            else:
                                if (board[char_y+1][char_x2]) == '*':
                                    char_y = char_y +1
                                    board[char_y][char_x2] = "A"
                                    board[prev_char_y][prev_char_x] = '*'
                                    prev_char_x = char_x2
                                    prev_char_y = char_y
                                    print_table(x)
    print('Number from dice: ', end='')
    print(f)
player1(13)
Does the technique I used have potential? Or is it too complicated? How would you do it?
Just in a generic sense you've made it overly complicated.
Consider this - the board, as far as movement is concerned, is just a set of ordered spaces.
But right now you have information about how the board is created as part of the player code.
Best to separate this, and you will find that things get simpler.
Instead, have the player simply track its progress; in other words, which numbered space it is on.
Then you can generate the board and, knowing the space numbers, you can see if it matches the player location.
And then take it one step further (and simpler still) and just draw the board on a 2D array, and then output that, instead of trying to figure out the board as you go line-by-line.
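The suggestion above could look roughly like the sketch below. It assumes the track is stored as an ordered list of (row, column) cells and the player is just an index into that list. The tiny PATH and the names roll_dice, advance and render are invented for illustration; a real 13x13 board would need its own list of cells.

```python
import random

# Hypothetical track on a 3x3 grid, in the order the piece travels.
PATH = [(0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0), (0, 0)]

def roll_dice():
    """Return a number from 1 to 6."""
    return random.randint(1, 6)

def advance(position, roll, path):
    """Move along the track, stopping at the last space."""
    return min(position + roll, len(path) - 1)

def render(size, path, position):
    """Draw the track on a 2D array, then mark the player's space."""
    grid = [[" "] * size for _ in range(size)]
    for row, col in path:
        grid[row][col] = "*"
    row, col = path[position]
    grid[row][col] = "A"
    return "\n".join(" ".join(cells) for cells in grid)

position = 0
position = advance(position, 3, PATH)  # a fixed roll of 3 for the example
print(render(3, PATH, position))
```

The movement code never inspects neighbouring cells: where the piece goes next is decided entirely by the order of the cells in PATH.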
I have an exercise to do and I'm stuck. I have to code the board game Alak, which is not widely known, in Python. I can link the exercise with the rules so you can help me better. The code has a main part and a library with all the procedures and functions.
from Library_alak import *
n = 0
while n < 1:
    n = int(input('Enter a strictly positive number of squares : '))
loop = True
player = 1
player2 = 2
removed = [-1]
board = newboard(n)
display(board, n)
while loop:
    i = select(board, n, player, removed)
    print(i)
    board = put(board, player, i)
    display(board, n)
    capture(board, n, player, player2)
    loop = True if again(board, n, player, removed) is True else False
    if player == 1 and loop:
        player, player2 = 2, 1
    elif player == 2 and loop:
        player, player2 = 1, 2
win(board, n)
print(win(board, n))
And here is the library:
def newboard(n):
    board = ([0] * n)
    return board
def display(board, n):
    for i in range(n):
        if board[i] == 1:
            print('X', end=' ')
        elif board[i] == 2:
            print('O', end=' ')
        else:
            print(' . ', end=' ')
def capture(board, n, player, player2):
    for place in range(n):
        if place == player:
            place_beginning = place
            while board[place] != player:
                place_end = place
                if board[place + x] == player:
                    return board
                else:
                    return board
def again(board, n, player, removed):
    for p in board(0):
        if p == 0:
            if p not in removed:
                return True
        else:
            return False
def possible(n, removed, player, i, board):
    for p in range(n + 1):
        if p == 1:
            if board[p-1] == 0:
                if p not in removed:
                    return True
        else:
            return False
def win(board, n):
    piecesp1 = 0
    piecesp2 = 0
    for i in board(0):
        if i == 1:
            piecesp1 += 1
        else:
            piecesp2 += 1
    if piecesp1 > piecesp2:
        print('Victory : Player 1')
    elif piecesp2 > piecesp1:
        print('Victory : Player 2')
    else:
        return 'Equality'
def select(board, n, player, removed):
    loop = True
    while loop:
        print('player', player)
        i = int(input('Enter number of boxes : '))
        loop = False if possible(n, removed, player, i, board)is True else True
    return i
def put(board, player, i):
    i -= 1
    if board[i] == 0:
        if player == 1:
            board[i] = 1
            return board
        else:
            board[i] = 2
            return board
    else:
        put(board, player, i)
So my problem is that I have a few errors. The first is that when I enter the number '1' when asked to enter a number of boxes (which is the place to play on), nothing happens. Then, when entering any other number, either the error is: if board[place + x] == player:
NameError: name 'x' is not defined
or there seems to be a problem with the : if board[place + x] == player:
NameError: name 'x' is not defined
I would appreciate a lot if anyone could help me. I'm conscious that it might not be as detailed as it should be and that you maybe don't get it all but you can contact me for more.
Rules of the Alak game:
Black and white take turns placing stones on the line. Unlike Go, this placement is compulsory if a move is available; if no move is possible, the game is over.
No stone may be placed in a location occupied by another stone, or in a location where a stone of your own colour has just been removed. The latter condition keeps the game from entering a neverending loop of stone placement and capture, known in Go as ko.
If placing a stone causes one or two groups of enemy stones to no longer have any adjacent empty spaces--liberties, as in Go--then those stones are removed. As the above rule states, the opponent may not play in those locations on their following turn.
If placing a stone causes one or two groups of your own colour to no longer have any liberties, the stones are not suicided, but instead are safe and not removed from play.
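The capture rule can be sketched for a one-dimensional board as follows. This is only an illustration, using the same encoding as the library above (0 for empty, 1 and 2 for the players). It assumes that the edge of the board does not count as a liberty, and the function name capture_groups is made up for the example. It returns the captured indices so the caller can forbid them on the opponent's next turn.

```python
def capture_groups(board, player):
    """Remove opponent groups without liberties; return the freed indices."""
    opponent = 2 if player == 1 else 1
    n = len(board)
    removed = []
    i = 0
    while i < n:
        if board[i] != opponent:
            i += 1
            continue
        # Find the end of this run of opponent stones.
        j = i
        while j < n and board[j] == opponent:
            j += 1
        left_free = i > 0 and board[i - 1] == 0
        right_free = j < n and board[j] == 0
        if not left_free and not right_free:
            removed.extend(range(i, j))  # no liberty on either side
        i = j
    for k in removed:
        board[k] = 0
    return removed

line = [1, 2, 2, 1, 0]
freed = capture_groups(line, player=1)  # the two stones in the middle go
```

Only the opponent's groups are examined, so the mover's own stones are never removed, which matches the last rule.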
You shouldn't use "player2" as a variable. There is an easier way: just use "player", which takes the value 1 or 2 depending on whose turn it is. Something like this: player = 1 if x%2==0 else 2
where x is just an int that increases from 0 until the end of the game.
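A minimal sketch of that idea, with a hypothetical helper name player_for_turn:

```python
def player_for_turn(turn):
    """Player 1 moves on even turns, player 2 on odd turns."""
    return 1 if turn % 2 == 0 else 2

order = [player_for_turn(turn) for turn in range(4)]
print(order)
```

The opponent is then simply 3 - player, so no second variable is needed.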
Can anyone tell me why my code prints 1 and not 8? It seems not to be going through every single state. Why is that?
I am using the minimax algorithm to find the best possible move for a given game state, here a tic-tac-toe board. Normally the search branches into a large tree of game states: each state that is not an ending state produces new branches, and the best move is found by recursively going down the tree and picking the best move for each player.
I was following the "tutorial" at http://giocc.com/concise-implementation-of-minimax-through-higher-order-functions.html.
My code:
#!/usr/bin/env python3
'''Minimax finds the best possible moves by applying a set of rules.
A win = 1, tie = 0, loss = -1 (for us). Assuming that each player chooses the best move
(we choose 1 if possible, opponent chooses -1). Starting at the top of a 'game tree',
generate the possible moves we can make. If It reaches a terminal state, stop. Otherwise keep searching in depth.
We find max.
'''
#[0,1,2,3,4,5,6,7,8]
class GameState: #a game state is a certain state of the board
    #http://stackoverflow.com/questions/1537202/variables-inside-and-outside-of-a-class-init-function
    x_went_first = True
    def __init__(self,board):
        self.board = board
        self.winning_combos = [[0,1,2],[3,4,5],[6,7,8],[0,3,6],[1,4,7],[2,5,8],[0,4,8],[2,4,8]]
    def is_gameover(self):
        if self.board.count('X') + self.board.count('O') == 9:
            return True
        for combo in self.winning_combos:
            if (self.board[combo[0]] == 'X' and self.board[combo[1]] == 'X' and self.board[combo[2]] == 'X') or (self.board[combo[0]] == 'O' and self.board[combo[1]] == 'O' and self.board[combo[2]] == 'O'):
                return True
        return False
    def get_possible_moves(self):
        squares = []
        for square in self.board:
            if square != 'X' and square != 'O':
                squares.append(int(square))
        return squares
    def get_next_state(self, move):
        copy = self.board
        num_of_x = copy.count('X')
        num_of_o = copy.count('O')
        #x starts, o's turn 1 > 0 o's turn
        #o starts, x's turn 1 < 0 x's turn
        #x starts, x's turn 1 > 1
        #o starts, o's turn 1 < 1
        if (self.x_went_first and num_of_x > num_of_o) or (self.x_went_first is not True and num_of_o == num_of_x):
            copy[move] = 'O'
        else:
            copy[move] = 'X'
        return GameState(copy)
def evals(game_state):
    for combo in [[0,1,2],[3,4,5],[6,7,8],[0,3,6],[1,4,7],[2,5,8],[0,4,8],[2,4,8]]:
        if game_state.board[0] == 'X' and game_state.board[1] == 'X' and game_state.board[2] == 'X':
            return 1
        elif game_state.board[0] == 'O' and game_state.board[1] == 'O' and game_state.board[2] == 'O':
            return -1
        else:
            return 0
def min_play(game_state):
    if game_state.is_gameover():
        return evals(game_state)
    moves = game_state.get_possible_moves()
    best_move = moves[0]
    best_score = 2 #not possible, best score is -1
    for move in moves:
        clone = game_state.get_next_state(move)
        score = max_play(clone)
        if score < best_score:
            best_move = move
            best_score = score
    return best_score
def max_play(game_state):
    if game_state.is_gameover():
        return evals(game_state)
    moves = game_state.get_possible_moves()
    best_score = -2 #not possible, best score is 1
    for move in moves:
        clone = game_state.get_next_state(move)
        score = min_play(clone)
        if score > best_score:
            best_move = move
            best_score = score
    return best_score
def minimax(game_state):
    moves = game_state.get_possible_moves()
    best_move = moves[0]
    best_score = -2
    for move in moves:
        clone = game_state.get_next_state(move)
        score = min_play(clone)
        if score > best_score:
            best_move = move
            best_score = score
    return best_move
game = GameState(['X',1,2,
                  3,'O',5,
                  6,7,8])
print(minimax(game))
My evals was always returning 0, and one of the winning combinations was messed up. New evals:
def evals(game_state):
    for combo in [[0,1,2],[3,4,5],[6,7,8],[0,3,6],[1,4,7],[2,5,8],[0,4,8],[2,4,6]]:
        a, b, c = combo  # check the three squares of this line, not just the top row
        if game_state.board[a] == 'X' and game_state.board[b] == 'X' and game_state.board[c] == 'X':
            return 1
        elif game_state.board[a] == 'O' and game_state.board[b] == 'O' and game_state.board[c] == 'O':
            return -1
    return 0
I also modified it so the index isn't in every empty slot. View the full code at https://github.com/retep-mathwizard/pyai/blob/master/minimax_ttt
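One further thing that may be worth checking in the question's code: copy = self.board does not copy the list, it only gives the same list a second name, so every simulated move is also written to the parent state. A minimal sketch of producing a genuinely separate board follows; next_board is a hypothetical helper, not part of the linked repository.

```python
def next_board(board, move, mark):
    """Return a new board with `mark` placed at `move`; the input is untouched."""
    new_board = list(board)  # shallow copy is enough for a flat list
    new_board[move] = mark
    return new_board

start = ['X', 1, 2,
         3, 'O', 5,
         6, 7, 8]
after = next_board(start, 8, 'X')
```

With this, start still has 8 in its last slot after the call, so sibling branches of the search each begin from the same unmodified position.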