Pick Up Sticks / Intelligent AI / Python

BACKGROUND INFO, DON'T HAVE TO READ IF YOU'D JUST LIKE TO VIEW THE PROBLEM WITH CODE BELOW:
I hope everyone is familiar with the game of sticks or "nim." If not, you set a starting amount of sticks (between 10 and 50) and draw (1-3 sticks) until there aren't any sticks left, declaring the one who pulled the last stick the loser. In my programming class we've also included the option of playing against the AI. But, the AI is no longer a dummy who randomly picks a number 1-3. Now he learns from each of his turns.
Implementation:
The AI has a bucket for each of the number of sticks left. There is a bucket for 1 stick
left, 2 sticks, 3 sticks, etc.
At the beginning of the game each bucket has 3 balls in it. Each marked with the choice
1, 2 or 3. These represent the AI’s choice of picking up 1, 2 or 3 sticks.
During the AI’s turn, it takes a random ball from the bucket representing the number of
sticks left. It reads the ball and removes that number of sticks from the pile. It then
places the ball in front of the bucket.
If the AI wins the game, then it goes through all of its choices and puts two balls back for
the chosen number for each choice it made. Increasing its chances of choosing that ball
the next time it’s faced with a choice with the given number of sticks.
If the AI loses, then it throws the ball away next to the buckets. However, if the chosen
ball is the last choice then it puts it back into the bucket. The bucket must contain at
least one of each number. So if the user chose a ball that had a number of sticks to
pick from a bucket, and it was the last ball of that choice, then if the AI loses, it must put
that ball back. It can never remove any of the choices completely from the buckets.
As more games are played the AI will reinforce good choices with extra balls for winning
sticks picked up.
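The bucket scheme described above can be sketched roughly as follows. This is only an illustration under some assumptions: the names new_buckets, draw_ball and settle are hypothetical, index 0 of the bucket list is left unused so that buckets[n] is the bucket for n sticks, and balls that would take more sticks than remain are skipped when drawing (the assignment text does not say how to handle that case).

```python
import random

def new_buckets(max_sticks=50):
    # buckets[n] holds the balls for "n sticks left"; index 0 is never used.
    return [[1, 2, 3] for _ in range(max_sticks + 1)]

def draw_ball(buckets, sticks, rng=random):
    # Assumption: only balls that do not exceed the remaining sticks are legal.
    legal = [ball for ball in buckets[sticks] if ball <= sticks]
    ball = rng.choice(legal)
    buckets[sticks].remove(ball)   # the ball now sits "in front of" the bucket
    return ball

def settle(buckets, picks, ai_won):
    # picks is a list of (sticks_left, ball) pairs made by the AI in one game.
    for sticks, ball in picks:
        bucket = buckets[sticks]
        if ai_won:
            bucket.extend([ball, ball])   # put two balls back for that choice
        elif ball not in bucket:
            bucket.append(ball)           # never lose a choice completely
```

After a game, call settle with every (sticks_left, ball) pair the AI drew, so winning choices become more likely and losing ones less likely.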
Here's the code I'm working with right now.
choice=random.randint(1,maxchoice) #computer picks a random number
bucketnum=sticks #bucket to take ball from
pullnum=choice #ball to take
for i in bucket:
    for bucket[bucketnum] in i:
        bucketnum.pop(pullnum)
print(bucket[bucketnum])
The bucket that I'd be taking the ball out of would essentially be the number of sticks left. I'm just having trouble finding a specific bucket in the bucket list and taking out the ball. Right now I get an error message for bucketnum.pop(pullnum): 'int' object has no attribute 'pop'. This is the bucket code (lists within a list):
bucket=[]
for j in range(51):
bucket.append([1,2,3])
I may be totally confusing but if anybody has any advice or even questions for clarification, please do reply. Thanks all.
EDIT:
Here's some more code, sorry, stupid of me to refrain from adding the definitions of variables, etc.
if option==2:
    sticks=""
    while True:
        try:
            sticks=int(input("Enter the number of sticks to begin with: "))
            if sticks>=10 and sticks<=50:
                print("Alright, there are",sticks,"sticks in the pile.")
                break
            else:
                print("You must enter an integer. (10-50)")
        except ValueError:
            print("You must enter an integer.")
    player1=True
    while sticks>0:
        maxchoice=min(3,sticks)
        choice=-1
        countbucket=0
        if player1:
            while choice<1 or choice>maxchoice:
                try:
                    choice=int(input("Player 1, how many sticks would you like to take? (1-3): "))
                    if choice>=1 and choice<=3:
                        sticks-=choice
                        print("There are",sticks,"sticks remaining.")
                    else:
                        print("You must enter an integer from 1-3.")
                except ValueError:
                    print("You must enter an integer.")
            player1=not player1
        else:
            choice=random.randint(1,maxchoice)
            bucketnum=sticks
            pullnum=choice
            for i in bucket:
                for bucket[bucketnum] in i:
                    bucketnum.pop(pullnum)
            print(bucket[bucketnum])
            sticks-=1
            print("Computer drew",choice,"stick(s). There are",sticks,"sticks remaining.")
            player1=not player1
    if player1==False:
        print("Player 1 took the last stick.\nComputer wins!")
    else:
        print("Player 1 wins!")
This is option 2 in my program, as option 1 is Player 1 vs. Player 2. Obviously I haven't gotten very far with the implementation of the AI intelligence, it's a bit tricky.
-----> Fred S., I'm just getting started and having issues getting the mental wheel spinning. What's excerpted isn't all of the code. I'm not asking how to complete the assignment at this point, though tips on executing this new intelligent AI code would be helpful, but in this case it's more a focus on figuring out list indexing.

It looks like you're using 'bucket[bucketnum]' as the loop variable of the inner for loop. I'm surprised that's not a syntax error, but I don't think it's what you're actually trying to do.
If you're dealing with a nested list, and the position in the list corresponds to the number of sticks left, then you want to index that list by the position in order to get that bucket, instead of iterating over that list to find it.
If you think of it like this:
buckets = [[1,2,3], ..., ..., ...]
Then the bucketnum is the position of the bucket in the list of buckets. So, in your case, if you want to grab the bucket for '26' sticks, you would access it by indexing buckets by that number.
buckets[25] # 25 since you're counting from 0+
At this point, you have the bucket in question, and can pop the choice from it.
bucket = buckets[25]
bucket.remove(pullnum) # remove() deletes the ball by value; pop(pullnum) would treat pullnum as an index
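As a small self-contained illustration of the indexing idea, here is a sketch that assumes the 51-bucket list from the question. With 51 buckets, index 0 is simply an unused bucket, so under that assumption the bucket for 26 sticks sits at index 26 and no "minus one" is needed. The values of sticks and pullnum are made up for the example.

```python
# Bucket list as built in the question: 51 buckets, indices 0..50.
buckets = [[1, 2, 3] for _ in range(51)]

sticks = 26    # hypothetical number of sticks left
pullnum = 2    # hypothetical ball the computer drew

bucket = buckets[sticks]   # direct indexing, no loop needed
bucket.remove(pullnum)     # remove the ball with this value

print(buckets[sticks])     # the same list object, now without the ball
```

Because bucket and buckets[sticks] refer to the same list, removing the ball through either name changes the bucket stored in the outer list.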

You didn't define option
You didn't import the random library.

Related

Why is my answer wrong in Code jam 2018 "Saving the World Again"?

The problem is presented here: https://codingcompetitions.withgoogle.com/codejam/round/00000000000000cb/0000000000007966
An alien robot is threatening the universe, using a beam that will destroy all algorithms knowledge. We have to stop it!
Fortunately, we understand how the robot works. It starts off with a beam with a strength of 1, and it will run a program that is a series of instructions, which will be executed one at a time, in left to right order. Each instruction is of one of the following two types:
C (for "charge"): Double the beam's strength.
S (for "shoot"): Shoot the beam, doing damage equal to the beam's current strength.
For example, if the robot's program is SCCSSC, the robot will do the following when the program runs:
Shoot the beam, doing 1 damage.
Charge the beam, doubling the beam's strength to 2.
Charge the beam, doubling the beam's strength to 4.
Shoot the beam, doing 4 damage.
Shoot the beam, doing 4 damage.
Charge the beam, increasing the beam's strength to 8.
In that case, the program would do a total of 9 damage.
The universe's top algorithmists have developed a shield that can withstand a maximum total of D damage. But the robot's current program might do more damage than that when it runs.
The President of the Universe has volunteered to fly into space to hack the robot's program before the robot runs it. The only way the President can hack (without the robot noticing) is by swapping two adjacent instructions. For example, the President could hack the above program once by swapping the third and fourth instructions to make it SCSCSC. This would reduce the total damage to 7. Then, for example, the president could hack the program again to make it SCSSCC, reducing the damage to 5, and so on.
To prevent the robot from getting too suspicious, the President does not want to hack too many times. What is the smallest possible number of hacks which will ensure that the program does no more than D total damage, if it is possible to do so?
Input
The first line of the input gives the number of test cases, T. T test cases follow. Each consists of one line containing an integer D and a string P: the maximum total damage our shield can withstand, and the robot's program.
Output
For each test case, output one line containing Case #x: y, where x is the test case number (starting from 1) and y is either the minimum number of hacks needed to accomplish the goal, or IMPOSSIBLE if it is not possible.
I implemented the following logic:
- First calculate the damage of the ship.
- The S has its greatest value when it is at the end, so the swaps should start at the end and continue towards the beginning of the list.
- The C at the end becomes useless so I pop it out of the list so it does not iterate over it again.
- In order to simplify the O() complexity I decided to subtract the last value of S from theSUM every time a swap is made.
The test results seem right - but the judge of the system says : Wrong Answer.
Can you help me find the mistake?
(I only know how to operate with lists and dictionaries in Python 3 and I am an absolute beginner at solving these questions.)
my code is below:
for case in range(1,T):
    D, B = input().split()
    D = int(D)
    Blist =[]
    [Blist.append(i) for i in B]
    def beamDamage(Blist):
        theSum=0
        intS=1
        Ccount = 0
        for i in Blist:
            if i == 'S':
                theSum = theSum + intS
            if i == 'C':
                Ccount = Ccount +1
                intS = intS*2
        return theSum
    def swap(Blist):
        temp=''
        for i in range(0,len(Blist)):
            if Blist[len(Blist)- 1] == 'C':
                Blist.pop()
            if (Blist[len(Blist)- i - 1]) == 'C' and (Blist[len(Blist)- i] == 'S'):
                temp = Blist[len(Blist)- i - 1] # C
                Blist[len(Blist)- i - 1] = 'S'
                Blist[len(Blist)- i] = temp
                return Blist
    bd = beamDamage(Blist)
    y = 0
    if 'C' not in B:
        if beamDamage(Blist) > D:
            print("Case #{}: IMPOSSIBLE".format(case))
        else:
            print("Case #{}: 0".format(case))
    else:
        while bd > D:
            swap(Blist)
            pwr=0
            for ch in Blist:
                if ch == 'C':
                    pwr=pwr+1
            bd = bd - 2**(pwr-1)
            y+=1
        print("Case #{}: {}".format(case, y))

I will not give you a complete solution, but here is one issue:
If your input is a series of "S" followed by one or more "C" (like "SSSSC"), and the calculated damage is higher than asked for, you'll clearly see that the result is wrong. It should be IMPOSSIBLE...
The reason for the failure is that the condition in if 'C' not in B: will not apply, and so the loop will kick in (when it really shouldn't). Consequently pwr remains zero and you use a calculation with 2**-1, which yields a non-integer value.
The solution is to strip any trailing C characters from the list at the very start, even before doing the if test.
Secondly, I don't see the benefit of doing the damage calculation in two different ways. On the one hand you have beamDamage, and you also have the inline loop, which does roughly the same (not faster).
Finally, even if you get this right, I suspect your code might run into a timeout, because it is not doing the job efficiently. Think of keeping track of the damage incrementally, without needing to go through the whole list again.
Once you have that improvement, you may still need to tune performance further. In that case, think of what damage reduction you would get if you moved a "C" immediately to the very end of the list. If that reduction still does not bring the damage below the target, you can make that whole move in one go (but still count the steps correctly).
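For readers who want to see the incremental bookkeeping in action, here is a minimal sketch of one common greedy approach. It is not the asker's code: the function name min_hacks and the use of None for the IMPOSSIBLE case are assumptions made for the example. The idea is to swap the rightmost "CS" pair each time and subtract the saved damage from a running total instead of recomputing it.

```python
def min_hacks(max_damage, program):
    """Return the minimum number of adjacent swaps, or None if impossible."""
    prog = list(program)

    # Compute the initial damage once.
    total, strength = 0, 1
    for ch in prog:
        if ch == 'C':
            strength *= 2
        else:
            total += strength

    hacks = 0
    while total > max_damage:
        # Find the rightmost "CS" pair; swapping it saves the most damage.
        i = len(prog) - 2
        while i >= 0 and not (prog[i] == 'C' and prog[i + 1] == 'S'):
            i -= 1
        if i < 0:
            return None  # no C stands before any S, nothing left to improve
        # Strength of the shot at i + 1 before the swap.
        shot = 2 ** prog[:i + 1].count('C')
        prog[i], prog[i + 1] = 'S', 'C'
        total -= shot // 2  # the shot now fires at half its former strength
        hacks += 1
    return hacks
```

For example, min_hacks(6, "SCCSSC") needs two swaps, and min_hacks(1, "SS") returns None because no swap can reduce the damage.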

Explanation on Stone Nim Game

I was doing a coding problem which I somehow passed all test cases but I did not understand exactly what was going on. The problem was a small twist on the classic nim game:
There are two players A and B. There are N piles of various stones. Each player can take any amount of stones if the pile is less than K, otherwise they must take a multiple of K stones. The last person to take stones wins.
# solution -> will A win the game of piles, k?
def solution(piles, k):
    gn = 0 # Grundy number
    for pile in piles:
        if pile % 2 != 0:
            gn ^= pile + 1
        else:
            gn ^= pile - 1
    return gn != 0
I'm not sure if there was enough test cases, but k was not even used here. To be honest, I am having a difficult time even understanding what gn (Grundy number) really means. I realize there is a proof of winning the Nim game if the xor of all piles is not zero, but I don't really understand why this variation requires checking the parity of the pile.

First, the given solution is incorrect. You noticed that it does not use k, and indeed this is a big red flag. You can also look at the result it gives for a single-pile game, where it says that player A wins for every pile size, which you should fairly quickly be able to show is incorrect (for example, with k = 3 and a pile of 4 stones, A must take 3 and B then takes the last stone).
The structure of the answer is sort of correct, though. A lot of the power of the Grundy number is that the Grundy number of a combined game state is the nim sum (XOR in the case of finite ordinals) of the Grundy numbers of the individual game states. (This only works for a very specific way of combining game states, but this turns out to be the natural way of considering Nim piles together.) So, this problem can indeed be solved by finding the Grundy number for each pile (considering k) and XOR-ing them together to get the Grundy number for the full game state. (In Nim where you can take any number of stones from a pile and win by taking the last stone, the Grundy number of a pile is just the size of a pile. That's why the solution to that version of Nim just XOR-s the sizes of the piles.)
So, taking the theory for granted, you can solve the problem by finding the correct Grundy values for a single pile given k. You only need to consider one pile games to do this. This is actually a pretty classic problem, and IMO significantly simpler to correctly analyze than multi-pile Nim. You should give it a go.
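A brute-force way to explore the single-pile values is to compute them directly from the minimum excluded value definition. The sketch below is only an aid for experimenting, and it rests on one reading of the rules: a pile smaller than k allows taking any positive number of stones, and a pile of at least k stones allows taking only positive multiples of k. The names mex, make_grundy and first_player_wins are made up for the example.

```python
from functools import lru_cache

def mex(values):
    # Smallest non-negative integer that is not in the given set.
    m = 0
    while m in values:
        m += 1
    return m

def make_grundy(k):
    @lru_cache(maxsize=None)
    def grundy(n):
        if n == 0:
            return 0                      # no moves left
        if n < k:
            takes = range(1, n + 1)       # any amount from a small pile
        else:
            takes = range(k, n + 1, k)    # only multiples of k otherwise
        return mex({grundy(n - t) for t in takes})
    return grundy

def first_player_wins(piles, k):
    grundy = make_grundy(k)
    total = 0
    for pile in piles:
        total ^= grundy(pile)             # nim sum of the single-pile values
    return total != 0
```

Printing the values of grundy(n) for small n and a few values of k is a quick way to spot the pattern before trying to prove it. The recursion is fine for small piles; very large piles would need an iterative table or the closed form.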
As for how to think of Grundy numbers, there are plenty of places to read about it, but here's my approach. The thing to understand is why the combination of two game states allows the previous player (B) to win exactly when the Grundy numbers are equal.
To do this, we need only consider what effect moves have on the Grundy numbers of the two states.
By definition as the minimum excluded value of successor states, there is always a move that changes the Grundy number of a state to any lower value (ie n could become any number from 0 up to n - 1). There is never a move that leaves the Grundy number the same. There may or may not be moves that increase the Grundy number.
Then, in the case of the combination of two states with the same Grundy number, the player B can win by employing the "copycat strategy". If player A makes a move that decreases the Grundy number of one state, player B can "copy" by reducing the Grundy number of the other state to the same value. If player A makes a move that increases the Grundy number of one state, player B can "undo" it by making a move on the same state to reduce it to the same value it was before. (Our game is finite, so we don't have to worry about an infinite loop of doing and undoing.) These are the only things A can do. (Remember, importantly, there is no move that leaves a Grundy number unchanged.)
If the states don't have the same Grundy number, then the way for the first player to win is clear: they just reduce the number of the state with the higher value to match the state with the lower value. This reduces things to the previous scenario.
Here we should note that the minimum excluded value definition allows us to construct the Grundy number for any states recursively in terms of their successors (at least for a finite game). There are no choices, so these numbers are in fact well-defined.
The next question to address is why we can calculate the Grundy number of a combined state. I prefer not to think about XOR at all here. We can define this nim sum operation purely from the minimum excluded value property. We abstractly consider the successors of nim_sum(x, y) to be {nim_sum(k, y) for k in 0..x-1} and {nim_sum(x, k) for k in 0..y-1}; in other words, making a move on one sub-state or the other. (We can ignore successors of one of the sub-states that increase the Grundy number, as such a state would have all the successors of the original state plus nim_sum(x, y) itself as another successor, so it must then have a strictly larger Grundy number. Yes, that's a little bit hand-wavy.) This turns out to be the same as XOR. I don't have a particularly nice explanation for this, but I feel it isn't really necessary to a basic understanding. The important thing is that it is a well-defined operation.
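The claim that this recursively defined nim sum coincides with XOR can at least be checked numerically for small values. The following sketch implements the definition above literally (the function name nim_sum matches the notation used in the text) and compares it with Python's ^ operator. It is a sanity check, not a proof.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def nim_sum(x, y):
    # Successors: lower the first component or lower the second one.
    successors = {nim_sum(k, y) for k in range(x)}
    successors |= {nim_sum(x, k) for k in range(y)}
    # Minimum excluded value of the successor set.
    m = 0
    while m in successors:
        m += 1
    return m

# Compare with bitwise XOR on a small grid of values.
matches = all(nim_sum(x, y) == x ^ y for x in range(16) for y in range(16))
print(matches)
```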

Poker Game Swapping Out Cards For Different Cards

Okay so, I'm onto the next step after dealing the cards to two players.
I need the program to be able to take the player's desired cards it wants to get rid of and exchange them for new random cards. The player will be questioned how many and which cards it wants to exchange. The code should be something like if the player inputs '1' for one throwaway card and then the player has the option to select which card to remove. So that card will then be removed from the hand or list in the code and replaced with 1 new one. This only happens once and then it should print both players' hands.
Everywhere I look, it's done in a more complicated way, and I know it's simple coding, but I really do suck at the simplest things.
What I've got so far:
def poker():
    import random
    (raw_input('Welcome to a classic game of Poker! You will receive 5 cards. You will have the option to exchange 1 to 3 cards from your hand for new cards of the same amount you exchanged. IF you have an Ace in your beginning hand, you may exchange that Ace for up to four new cards (three other cards including the ace). ~Press Enter~'))
    (raw_input('S = Spades , H = Hearts , C = Clubs , D = Diamonds ~Press Enter~'))
    deck = ['2S','2H','2C','2D','3S','3H','3C','3D','4S','4H','4C','4D','5S','5H','5C','5D','6S','6H','6C','6D','7S','7H','7C','7D','8S','8H','8C','8D','9S','9H','9C','9D','10S','10H','10C','10D','Jack(S)','Jack(H)','Jack(C)','Jack(D)','Queen(S)','Queen(H)','Queen(C)','Queen(D)','King(S)','King(H)','King(C)','King(D)', 'Ace(S)','Ace(H)','Ace(C)','Ace(D)']
    new_cards = ''
    player1 = []
    player2 = []
    random.shuffle(deck)
    for i in range(5): player1.append(deck.pop(0)) and player2.append(deck.pop(0))
    print player1
    int(input('How many cards would you like to exchange? 1, 2, 3, or 4 IF you have an Ace.'))
    #ignore this for now
    int(input('Which card would you like to exchange? 1, 2, 3, 4, or 5? Note: The first card in your hand (or list in this case) is the number 1 spot. So if you want to exchange the first card, input 1. The same is for the other cards.'))
The card that was exchanged in the beginning hand also can't be accessible from the deck list after swapping. So like... ['8D','2S','Queen(H)','8S','Jack(H)']
If I wanted to remove 1 card and I choose to remove '2S', '2S' will no longer be in my hand and will be swapped out with a different card from the deck. '2S' will also not return to my hand for any reason because it can't be taken from the list a second time. So the output should be all the same cards EXCEPT the '2S' will be missing and a new card will be in its place.
There is the standard removing up to 3 cards at once but you can also remove up to 4 cards IF you have an Ace in your beginning hand. But you should be rejected and then asked once more how many cards you want to get rid of if you don't provide an Ace to the question.

What could work is the following :
n_cards_to_exchange = int(input('How many cards would you like to exchange? 1, 2, 3, or 4 IF you have an Ace.'))
for i in range(n_cards_to_exchange):
    print(player1)
    card_text = ', '.join([str(j) for j in range(1,5-i)]) + f', or {5-i}?'
    card_id = int(input(f'Which card would you like to exchange? {card_text} Note: The first card in your hand (or list in this case) is the number 1 spot. So if you want to exchange the first card, input 1. The same is for the other cards.')) - 1
    deck.append(player1.pop(card_id))
random.shuffle(deck)
for i in range(n_cards_to_exchange):
    player1.append(deck.pop(0))
The idea is that the player chooses how many cards he wants to drop, then picks which card to drop that many times, and finally draws the same number of cards back from the deck. If you need any clarification, feel free to ask.
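Note that the snippet above puts the dropped cards back into the deck before drawing, so a dropped card could come straight back. If the dropped cards must never return, one option is to keep them in a separate discard list. The sketch below takes that route, leaves out the input prompts, and is written for Python 3. The function name exchange_cards and the choice to raise ValueError on a bad request are assumptions for the example.

```python
def exchange_cards(hand, deck, positions):
    """Swap the cards at the given 1-based positions for cards from the deck.

    Returns the list of discarded cards, which are NOT put back in the deck.
    """
    # Up to 3 cards normally, up to 4 if the hand contains an Ace.
    limit = 4 if any(card.startswith('Ace') for card in hand) else 3
    if not 1 <= len(positions) <= limit:
        raise ValueError('you may exchange between 1 and %d cards' % limit)
    if len(set(positions)) != len(positions):
        raise ValueError('each position may be chosen only once')
    if any(p < 1 or p > len(hand) for p in positions):
        raise ValueError('positions must be between 1 and %d' % len(hand))

    discarded = []
    # Remove from the highest position down so earlier removals
    # do not shift the positions still to be removed.
    for pos in sorted(positions, reverse=True):
        discarded.append(hand.pop(pos - 1))
    # Draw one replacement per discarded card from the top of the deck.
    for _ in positions:
        hand.append(deck.pop(0))
    return discarded
```

A caller could wrap this in a loop that asks again whenever ValueError is raised, which covers the case where four cards are requested without an Ace in the hand.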

Josephus algorithm partial success

My friend told me about Josephus problem, where you have 41 people sitting in the circle. Person number 1 has a sword, kills person on the right and passes the sword to the next person. This goes on until there is only one person left alive. I came up with this solution in python:
print('''There are n people in the circle. You give the knife to one of
them, he stabs person on the right and
gives the knife to the next person. What will be the number of whoever
will be left alive?''')
pplList = []
numOfPeople = int(input('How many people are there in the circle?'))
for i in range(1, (numOfPeople + 1)):
    pplList.append(i)
print(pplList)
while len(pplList) > 1:
    for i in pplList:
        if i % 2 == 0:
            del pplList[::i]
            print(f'The number of person which survived is {pplList[0]+1}')
            break
But it only works up to 42 people. What should I do, or how should I change the code so it would work for, for example, 100, 1000 and more people in the circle?
I've looked up Josephus problem and seen different solutions but I'm curious if my answer could be correct after some minor adjustment or should I start from scratch.

I see two serious bugs.
I guarantee that del pplList[::i] does nothing resembling what you hope it does.
When you wrap around the circle, it is important to know if you killed the last person in the list (first in list kills again) or didn't (first person in list dies).
And contrary to your assertion that it works up to 42, it does not work for many smaller numbers. The first that it doesn't work for is 2. (It gives 3 as an answer instead of 1.)

The problem is that you are not considering the person at the end if he is not killed. For example, if there are 9 people, after 8 is killed, 9 has the sword, but your next loop starts with 1 instead of 9. As someone mentioned already, it does not work for smaller numbers either. Actually, if you look closely, you're killing odd numbers in the very first loop instead of even numbers, which is wrong.
You can correct your code as follows:
while len(pplList) > 1:
    if len(pplList) % 2 == 0:
        pplList = pplList[::2] #omitting every second number
    elif len(pplList) % 2 == 1:
        last = pplList[-1] #last one won't be killed
        pplList = pplList[:-2:2]
        pplList.insert(0, last) # adding the last to the start
There are very effective methods to solve the problem other than this method. check this link to know more
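One of those more direct methods is the well-known closed form for the "every second person" variant: write n as 2^m + l with 2^m the largest power of two not exceeding n, and the survivor is 2*l + 1. The sketch below states that formula and cross-checks it against the list-slicing loop from this answer. The function names josephus_survivor and simulate are made up for the example.

```python
def josephus_survivor(n):
    # Largest power of two that does not exceed n.
    highest_power = 1 << (n.bit_length() - 1)
    return 2 * (n - highest_power) + 1

def simulate(n):
    # Same slicing idea as the corrected loop above, wrapped in a function.
    people = list(range(1, n + 1))
    while len(people) > 1:
        if len(people) % 2 == 0:
            people = people[::2]
        else:
            last = people[-1]                 # the last person survives the round
            people = [last] + people[:-2:2]   # and holds the sword next
    return people[0]

print(josephus_survivor(41))
```

The closed form runs in constant time, so 100, 1000 or far larger circles are no problem.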

Minimax Algorithm Implementation In Python3

I have been trying to build a Tic-Tac-Toe bot in Python. I tried to avoid using the Minimax algorithm, because I was QUITE daunted how to implement it. Until now.
I (finally) wrote an algorithm that sucked and could lose pretty easily, which kinda defeats the purpose of making a computer play Tic-Tac-Toe. So I finally took the courage to TRY to implement the algorithm. I stumbled upon this StackOverflow post. I tried to implement the chosen answer there, but I can't understand most of the stuff. The code in that answer follows:
def minimax(self, player, depth = 0) :
    if player == "o":
        best = -10
    else:
        best = 10
    if self.complete():
        if self.getWinner() == "x": # 'X' is the computer
            return -10 + depth, None
        elif self.getWinner() == "tie":
            return 0, None
        elif self.getWinner() == "o" : # 'O' is the human
            return 10 - depth, None
    for move in self.getAvailableMoves() :
        self.makeMove(move, player)
        val, _ = self.minimax(self.getEnemyPlayer(player), depth+1)
        print(val)
        self.makeMove(move, ".")
        if player == "o" :
            if val > best :
                best, bestMove = val, move
        else :
            if val < best :
                best, bestMove = val, move
    return best, bestMove
First of all, why are we returning -10 + depth when the computer wins and 10 - depth when the human wins? (I get why we return 0 when it is a draw.) Secondly, what is the depth parameter doing? Is there some way to omit it? Should we omit it?
I'm probably missing something fundamental about the algorithm but, I think I understand it well enough. Please bear in mind that I'm very new to recursive algorithms...
EDIT
So, now I made myself the function:
def minimax(self, player):
    won = 10
    lost = -10
    draw = 0
    if self.has_won(HUMAN):
        return lost, None
    elif self.has_won(BOT):
        return won, None
    if not(self.board_is_empty()):
        return draw, None
    moves = self.get_available_moves()
    for move in moves:
        self.play_move(move[0], move[1], player)
        make_board(self.board)
        if self.board_is_empty():
            val, _ = self.minimax(self.get_enemy_player(player))
        self.rewind_move(move)
        if val==won:
            return val, move
But the problem now is I can't understand what happens when the move ends in a draw or a loss (for the computer). I think what it's doing is that it goes through a move's consequences to see if SOMEONE wins (that's probably what is happening, because I tested it) and then returns that move if SOMEONE wins. How do I modify this code to work properly?
Note:
This function is in a class, hence the self keywords.
moves is a list containing tuples. eg. moves = [(0, 1), (2, 2)] etc. So, moves contains all the empty squares. So each moves[i][j] is an integer modulo 3.
I'm using the exhaustive algorithm suggested by Jacques de Hooge in his answer below.

First note that 10 - depth = - (-10 + depth).
So computer wins have opposite signs from human wins.
In this way they can be added to evaluate the value of a gameboard state.
While with tictactoe this isn't really needed, in a game like chess it is, since it is too timeconsuming to try all possible combinations until checkmate, hence gameboard states have to be evaluated somehow in terms of losses and wins (losing and winning chess pieces, each worth a certain amount of points, values drawn from experience).
Suppose now we only look at 10 - depth (so human wins).
The most attractive wins are the ones that require the fewest plies (moves).
Since each move or countermove results in the depth being incremented, more moves will result in the parameter depth being larger, so 10 - depth (the "amount" of advantage) being smaller. So quick wins are favored over lengthy ones. 10 is enough, since there are only 9 moves possible in total on a 3 x 3 playfield.
So in short: since tictactoe is so simple, in fact the winning combination can be found in an exhaustive recursive search. But the minimax algorithm is suitable for more complicated situations like chess, in which intermediate situations have to be evaluated in terms of the sum of losses (negative) and gains (positive).
Should the depth parameter be omitted? If you care about the quickest win: No. If you only care about a win (with tictactoe): it can indeed be omitted, since an exhaustive search is possible.
[EDIT]
Exhaustive with tictactoe just means searching 9 plies deep, since the game can never last longer.
Make a recursive function with a parameter player (o or x) and a return value win or loss, that is first decided at the deepest recursion level, and then taken upward through the recursion tree. Let it call itself with the opposite player as parameter for all free fields. A move for the machine is the right one if any sequel results in the machine winning for all branches that the human may take on each level.
Note: The assumption I made is that there IS a winning strategy. If that is not the case (ties possible), the algorithm you have may be the best option. In tictactoe ties are in fact possible: with perfect play from both sides the game ends in a draw, so the player who starts cannot force a win and the search has to treat a tie as a third outcome between a win and a loss.
Against a non-perfect human player the computer may still win, whether or not it starts, if the human player does something suboptimal.
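To make the role of depth and of ties concrete, here is a small self-contained sketch of minimax for tictactoe. It is not the code from either post. It assumes a board stored as a list of nine characters ('x', 'o' or '.'), with 'x' as the computer, and it scores positions from the computer's point of view (positive is good for 'x'), which is the opposite sign convention from the snippet quoted in the question. The names WIN_LINES, winner and minimax are chosen for the example.

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player, depth=0):
    """Return (score, move) for the side to move; 'x' maximizes, 'o' minimizes."""
    won = winner(board)
    if won == 'x':
        return 10 - depth, None    # quicker computer wins score higher
    if won == 'o':
        return depth - 10, None    # slower losses score higher than quick ones
    moves = [i for i, cell in enumerate(board) if cell == '.']
    if not moves:
        return 0, None             # full board without a winner: tie

    best_score, best_move = None, None
    other = 'o' if player == 'x' else 'x'
    for move in moves:
        board[move] = player                       # try the move
        score, _ = minimax(board, other, depth + 1)
        board[move] = '.'                          # undo it
        better = (best_score is None
                  or (player == 'x' and score > best_score)
                  or (player == 'o' and score < best_score))
        if better:
            best_score, best_move = score, move
    return best_score, best_move
```

Every move is evaluated, including those that lead to a tie or a loss, and the best of them is returned. That is what handles the "what if nobody wins" case: a tie simply scores 0, between a win and a loss.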
