Explanation of the Stone Nim Game - Python

I was doing a coding problem which I somehow passed all test cases but I did not understand exactly what was going on. The problem was a small twist on the classic nim game:
There are two players A and B. There are N piles of various stones. Each player can take any amount of stones if the pile is less than K, otherwise they must take a multiple of K stones. The last person to take stones wins.
# solution -> will A win the game of piles, k?
def solution(piles, k):
    gn = 0  # Grundy number
    for pile in piles:
        if pile % 2 != 0:
            gn ^= pile + 1
        else:
            gn ^= pile - 1
    return gn != 0
I'm not sure if there were enough test cases, but k was not even used here. To be honest, I am having a difficult time even understanding what gn (the Grundy number) really means. I realize there is a proof that you win the Nim game if the XOR of all piles is not zero, but I don't really understand why this variation requires checking the parity of the pile.

First, the given solution is incorrect. You noticed that it does not use k, and indeed this is a big red flag. You can also look at the result it gives for a single-pile game, where it seems to say that player A only wins if the size of the pile is one, which you should fairly quickly be able to show is incorrect.
The structure of the answer is sort of correct, though. A lot of the power of the Grundy number is that the Grundy number of a combined game state is the nim sum (XOR in the case of finite ordinals) of the Grundy numbers of the individual game states. (This only works for a very specific way of combining game states, but this turns out to be the natural way of considering Nim piles together.) So, this problem can indeed be solved by finding the Grundy number for each pile (considering k) and XOR-ing them together to get the Grundy number for the full game state. (In Nim where you can take any number of stones from a pile and win by taking the last stone, the Grundy number of a pile is just the size of a pile. That's why the solution to that version of Nim just XOR-s the sizes of the piles.)
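For the plain version of Nim described in that parenthetical, the XOR test can be written directly; a minimal sketch:

```python
# Plain Nim (take any number of stones from one pile, last stone wins):
# the first player wins exactly when the XOR of the pile sizes is nonzero.
def first_player_wins_nim(piles):
    gn = 0
    for pile in piles:
        gn ^= pile  # the Grundy number of a plain Nim pile is its size
    return gn != 0
```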
So, taking the theory for granted, you can solve the problem by finding the correct Grundy values for a single pile given k. You only need to consider one pile games to do this. This is actually a pretty classic problem, and IMO significantly simpler to correctly analyze than multi-pile Nim. You should give it a go.
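As a starting point for that single-pile analysis, the Grundy values can be computed by brute force straight from the mex definition; this is a sketch for exploring the pattern (the move rule follows the problem statement), not the closed-form answer:

```python
from functools import lru_cache

# Brute-force Grundy values for one pile: from a pile p, if p < k you may
# take any amount 1..p; otherwise you must take a positive multiple of k.
def grundy(pile, k):
    @lru_cache(maxsize=None)
    def g(p):
        if p == 0:
            return 0
        if p < k:
            succ = {g(p - take) for take in range(1, p + 1)}
        else:
            succ = {g(p - m) for m in range(k, p + 1, k)}
        # mex: smallest non-negative integer not among the successor values
        m = 0
        while m in succ:
            m += 1
        return m
    return g(pile)
```

Printing `grundy(p, k)` for small p and a fixed k should reveal the pattern the correct solution needs.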
As for how to think of Grundy numbers, there are plenty of places to read about it, but here's my approach. The thing to understand is why the combination of two game states allows the previous player (B) to win exactly when the Grundy numbers are equal.
To do this, we need only consider what effect moves have on the Grundy numbers of the two states.
By definition as the minimum excluded value of successor states, there is always a move that changes the Grundy number of a state to any lower value (i.e. n could become any number from 0 up to n - 1). There is never a move that leaves the Grundy number the same. There may or may not be moves that increase the Grundy number.
Then, in the case of the combination of two states with the same Grundy number, the player B can win by employing the "copycat strategy". If player A makes a move that decreases the Grundy number of one state, player B can "copy" by reducing the Grundy number of the other state to the same value. If player A makes a move that increases the Grundy number of one state, player B can "undo" it by making a move on the same state to reduce it to the same value it was before. (Our game is finite, so we don't have to worry about an infinite loop of doing and undoing.) These are the only things A can do. (Remember, importantly, there is no move that leaves a Grundy number unchanged.)
If the states don't have the same Grundy number, then the way for the first player to win is clear: they just reduce the Grundy number of the state with the higher value to match the state with the lower value. This reduces things to the previous scenario.
Here we should note that the minimum excluded value definition allows us to construct the Grundy number for any states recursively in terms of their successors (at least for a finite game). There are no choices, so these numbers are in fact well-defined.
The next question to address is why we can calculate the Grundy number of a combined state. I prefer not to think about XOR at all here. We can define this nim sum operation purely from the minimum excluded value property. We abstractly consider the successors of nim_sum(x, y) to be {nim_sum(k, y) for k in 0..x-1} and {nim_sum(x, k) for k in 0..y-1}; in other words, making a move on one sub-state or the other. (We can ignore successor of one of the sub-states that increase the Grundy number, as such a state would have all the successors of the original state plus nim_sum(x, y) itself as another successor, so it must then have a strictly larger Grundy number. Yes, that's a little bit hand-wavy.) This turns out to be the same as XOR. I don't have a particularly nice explanation for this, but I feel it isn't really necessary to a basic understanding. The important thing is that it is a well-defined operation.
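For small values, that mex-based construction of the nim sum can be checked against XOR directly; a small sketch (the function name is illustrative):

```python
from functools import lru_cache

# nim_sum defined purely by the mex property described above: the successors
# of nim_sum(x, y) are the results of lowering either component.
@lru_cache(maxsize=None)
def nim_sum(x, y):
    succ = {nim_sum(a, y) for a in range(x)} | {nim_sum(x, b) for b in range(y)}
    m = 0
    while m in succ:  # mex of the successor set
        m += 1
    return m
```

Comparing `nim_sum(x, y)` with `x ^ y` over a small range shows they agree.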


Is this considered a fitness function for a genetic algorithm?

I start off with a population. I also have properties each individual in the population can have. If an individual DOES have the property, its score goes up by 5. If it DOESN'T have it, its score increases by 0.
Example code using length as a property:
for x in individual:
    if len(x) < 5:
        score += 5
    if len(x) >= 5:
        score += 0
Then I add up the total score and select the individuals I want to continue. Is this a fitness function?
Anything can be a fitness function as long as it awards better scores to better DNA. The code you wrote looks like a gene of a DNA rather than a constraint. If it was a constraint, you'd give it a growing score penalty (is it a minimization of score?) depending on the distance to the constraint point, so that the selection/crossover part could prioritize the DNAs closer to the constraint value of 5. But currently it looks like "anything > 5 works fine", so there will be a lot of random solutions with high diversity rather than values like 4.9, 4.99, etc., even if you apply elitism.
If there are many variables like "len" with equal score, then one gene's failure could be shadowed by another gene's success. To stop this, you can give them different scores like 5,10,20,40,... so that the selection and crossover can know if it actually made progress without any failure.
If you've meant a constraint by that 5, then you should tell the selection that the "failed" values closer to 5 (i.e. 4,4.5,4.9,4.99) are better than distant ones, by applying a variable score like this:
if (gene < constraint_value)
    score += (constraint_value - gene)^2;
// if you meant to add zero in the other case, there's no need to add it explicitly
In comments, you said molecular computations. Molecules have floating-point coordinates and masses, so if you are optimizing them, the constraint with a variable penalty will make it easier for the selection to keep better groups of DNAs for future generations, provided the mutation adds onto the current value of genes rather than setting them to a totally random value.
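As a sketch of that variable-penalty idea in Python (the constraint value, the gene encoding, and treating this as a minimization are illustrative assumptions, not from the original code):

```python
# Hypothetical constraint-based fitness with a quadratic penalty: genes
# below the constraint are penalized by their squared distance to it,
# so values like 4.9 or 4.99 score better than 4.0. Lower is better.
def fitness(genes, constraint_value=5.0):
    score = 0.0
    for gene in genes:
        if gene < constraint_value:
            # the closer the gene is to the constraint, the smaller the penalty
            score += (constraint_value - gene) ** 2
    return score
```

With this shape, selection can tell that a 4.9 is progress over a 4.0, instead of treating both as equal failures.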

Why is my answer wrong in Code jam 2018 "Saving the World Again"?

The problem is presented here: https://codingcompetitions.withgoogle.com/codejam/round/00000000000000cb/0000000000007966
An alien robot is threatening the universe, using a beam that will destroy all algorithms knowledge. We have to stop it!
Fortunately, we understand how the robot works. It starts off with a beam with a strength of 1, and it will run a program that is a series of instructions, which will be executed one at a time, in left to right order. Each instruction is of one of the following two types:
C (for "charge"): Double the beam's strength.
S (for "shoot"): Shoot the beam, doing damage equal to the beam's current strength.
For example, if the robot's program is SCCSSC, the robot will do the following when the program runs:
Shoot the beam, doing 1 damage.
Charge the beam, doubling the beam's strength to 2.
Charge the beam, doubling the beam's strength to 4.
Shoot the beam, doing 4 damage.
Shoot the beam, doing 4 damage.
Charge the beam, doubling the beam's strength to 8.
In that case, the program would do a total of 9 damage.
The universe's top algorithmists have developed a shield that can withstand a maximum total of D damage. But the robot's current program might do more damage than that when it runs.
The President of the Universe has volunteered to fly into space to hack the robot's program before the robot runs it. The only way the President can hack (without the robot noticing) is by swapping two adjacent instructions. For example, the President could hack the above program once by swapping the third and fourth instructions to make it SCSCSC. This would reduce the total damage to 7. Then, for example, the president could hack the program again to make it SCSSCC, reducing the damage to 5, and so on.
To prevent the robot from getting too suspicious, the President does not want to hack too many times. What is the smallest possible number of hacks which will ensure that the program does no more than D total damage, if it is possible to do so?
Input
The first line of the input gives the number of test cases, T. T test cases follow. Each consists of one line containing an integer D and a string P: the maximum total damage our shield can withstand, and the robot's program.
Output
For each test case, output one line containing Case #x: y, where x is the test case number (starting from 1) and y is either the minimum number of hacks needed to accomplish the goal, or IMPOSSIBLE if it is not possible.
I implemented the following logic:
- First calculate the damage done by the robot's program.
- An S has its greatest value when it is at the end, so the swaps should start at the end and continue towards the beginning of the list.
- A C at the end becomes useless, so I pop it off the list so the loop does not iterate over it again.
- To keep the O() complexity down, I decided to subtract the value of the last S from theSum every time a swap is made.
The test results seem right - but the judge of the system says : Wrong Answer.
Can you help me find the mistake?
(I know only how to operate with lists and dictionaries in Python 3 and I am an absolute beginner at solving these questions.)
my code is below:
for case in range(1, T):
    D, B = input().split()
    D = int(D)
    Blist = []
    [Blist.append(i) for i in B]

    def beamDamage(Blist):
        theSum = 0
        intS = 1
        Ccount = 0
        for i in Blist:
            if i == 'S':
                theSum = theSum + intS
            if i == 'C':
                Ccount = Ccount + 1
                intS = intS * 2
        return theSum

    def swap(Blist):
        temp = ''
        for i in range(0, len(Blist)):
            if Blist[len(Blist) - 1] == 'C':
                Blist.pop()
            if (Blist[len(Blist) - i - 1]) == 'C' and (Blist[len(Blist) - i] == 'S'):
                temp = Blist[len(Blist) - i - 1]  # C
                Blist[len(Blist) - i - 1] = 'S'
                Blist[len(Blist) - i] = temp
                return Blist

    bd = beamDamage(Blist)
    y = 0
    if 'C' not in B:
        if beamDamage(Blist) > D:
            print("Case #{}: IMPOSSIBLE".format(case))
        else:
            print("Case #{}: 0".format(case))
    else:
        while bd > D:
            swap(Blist)
            pwr = 0
            for ch in Blist:
                if ch == 'C':
                    pwr = pwr + 1
            bd = bd - 2**(pwr - 1)
            y += 1
        print("Case #{}: {}".format(case, y))
I will not give you a complete solution, but here is one issue:
If your input is a series of "S" followed by one or more "C" (like "SSSSC"), and the calculated damage is higher than asked for, you'll clearly see that the result is wrong. It should be IMPOSSIBLE...
The reason for the failure is that the condition in if 'C' not in B: does not apply, and so the loop kicks in (when it really shouldn't). Consequently pwr remains zero and you use a calculation with 2**-1, which yields a non-integer value.
The solution is to strip trailing C characters from the list at the very start, even before doing the if test.
Secondly, I don't see the benefit of doing the damage calculation in two different ways. On the one hand you have beamDamage, and you also have the inline loop, which does roughly the same (not faster).
Finally, even if you get this right, I suspect your code might run into a timeout, because it is not doing the job efficiently. Think of keeping track of the damage incrementally, without needing to go through the whole list again.
Once you have that improvement, you may still need to tune performance further. In that case, think of what damage reduction you would get if you moved a "C" immediately to the very end of the list. If that reduction still does not bring the damage below the target, you can go for that whole move in one go (but still count the steps correctly).
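As an illustration of the incremental bookkeeping hinted at above, a sketch might look like this (it still swaps one pair at a time rather than moving a "C" to the end in one go, and the names are my own):

```python
# Each swap of the rightmost "CS" pair into "SC" halves that shot's damage,
# so the total can be updated in place instead of rescanning the program.
def damage(program):
    strength, total = 1, 0
    for instr in program:
        if instr == 'C':
            strength *= 2
        else:
            total += strength
    return total

def min_hacks(D, program):
    p = list(program)
    total, hacks = damage(p), 0
    while total > D:
        # find the rightmost "CS"; if none exists, no swap can reduce damage
        for i in range(len(p) - 2, -1, -1):
            if p[i] == 'C' and p[i + 1] == 'S':
                # beam strength at that shot before the swap
                strength = 2 ** p[:i + 1].count('C')
                total -= strength // 2  # the swap halves this shot's damage
                p[i], p[i + 1] = 'S', 'C'
                hacks += 1
                break
        else:
            return 'IMPOSSIBLE'
    return hacks
```

On the statement's example, one hack turns SCCSSC into SCSCSC (damage 7) and a second yields SCSSCC (damage 5), matching the walkthrough.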

Which part of my code is causing a "Time-Out Error" and why?

I am trying to solve this problem on Hackerrank: https://www.hackerrank.com/challenges/climbing-the-leaderboard. The problem statement basically states that there are two sets of scores, one of the players and the other of Alice, and we have to use dense ranking and display Alice's rank compared to the other players' scores. It is giving me a Time-Out error on large test cases. I have used the forum suggestions on Hackerrank already and was successful, but I am specifically curious to know the problem in my code. Here is my code:
class Dict(dict):
    def __init__(self):
        self = dict()

    def add(self, key, value):
        self[key] = value

def climbingLeaderboard(scores, alice):
    alice_rank = []
    for i in range(len(alice)):
        scores.append(alice[i])
        a = list(set(scores))
        a.sort(reverse=True)
        obj = Dict()
        b = 1
        for j in a:
            obj.add(j, b)
            b += 1
        if alice[i] in obj:
            alice_rank.append(obj[alice[i]])
        scores.remove(alice[i])
    return alice_rank
You have a couple of problems in your code but the most important one is the following.
...
scores.append(alice[i])
a=list(set(scores))
a.sort(reverse=True)
...
On each iteration you add Alice's score to scores and then sort scores. The cost here is already O(n log n), where n is the number of elements in scores. Thus, your total time complexity becomes O(n^2 log n). That's too much, because n can reach 200000, and so your solution can take up to 200000 * 200000 * log(200000) operations.
Of course, there's another problem:
...
for j in a:
obj.add(j,b)
b+=1
...
But it's still not as bad as the previous one since the loop time complexity is O(n).
There exists an O(n log n) time complexity solution. I'll give you the overall idea so that you can easily implement it yourself.
If you recall that players with duplicate scores share the same position in the leaderboard, then you can convert your scores to a sorted array without duplicates, e.g. sorted(set(scores), reverse=True), before your loop. In that case, the first position corresponds to the highest score, the second one to the second highest score, and so on (the initial array is sorted in decreasing order per the problem statement).
Given the step above, for each of Alice's scores you can find the position in the array of the first player score that is less than or equal to it. The lookup takes O(log n) because the array is sorted. For instance, if the players' scores are 40, 30, 10 and Alice's score is 35, then the found position is 2 (for this description I consider that indexing starts from 1), as 30 occupies that position. This position is the ACTUAL position of Alice in the leaderboard, and so it can be printed right away.
Another tip - you can use bisect module for performing a binary search in the array.
So, the overall time complexity of the proposed solution is O(n*log(n)). It will pass all the test cases (I've tried it).
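A sketch of that idea using the bisect module (the function and variable names are my own):

```python
from bisect import bisect_right

# Dense ranking via binary search: rank of a score is 1 plus the number of
# distinct player scores strictly greater than it.
def climbing_leaderboard(scores, alice):
    unique = sorted(set(scores))  # ascending, as bisect expects
    n = len(unique)
    ranks = []
    for a in alice:
        # bisect_right gives the count of distinct scores <= a,
        # so n - idx is the count of distinct scores strictly above a
        idx = bisect_right(unique, a)
        ranks.append(n - idx + 1)
    return ranks
```

Deduplicating and sorting once costs O(n log n), and each of Alice's scores then costs only O(log n).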
Performing a repeated sort (a.sort(reverse=True)) consumes a lot of time. I had the same problem. If you read the question, you will find that the scores are given already sorted (the players' scores descending, Alice's ascending). The trick is to exploit this inherent ordering of the input.
One more thing: your code's time complexity is O(n^2) due to the nested loop, whereas the forum you spoke of may be doing it in O(n) (not sure).

Weighted Traversal Algorithm (Breadth first is better?)

I'm having trouble designing an algorithm for a traversal problem.
I have a Ship that I control on a 2D grid and it starts on the very bottom of the grid. Each tile of the grid has a value (between 0 and 1000) equal to how much 'resource' is in that tile.
The Ship can go_left(), go_up(), go_right() or stay_still()
If the ship stay_still(), it collects 25% of its current tile's resource (rounded up to the nearest int).
If the ship uses a move command, it must spend 10% of its current tile's resource value, rounded down. Moves that cost more than the ship has collected are illegal. (So if a ship is on a 100, it costs 10 to move off the 100; if it's on a 9 or less, moving is free.)
The goal is to find a relatively short path that legally collects 1000 resource, returning the list of moves that corresponds to the path.
I naturally tried a recursive approach:
In pseudo-code the algorithm is:
alg(position, collected, best_path):
    if ship has 1000:
        return best_path
    alg(stay still)
    if ship has enough to move:
        alg(try left)
        alg(try up)
        alg(try right)
If you want a closer look at the actual syntax in python3 here it is:
def get_path_to_1000(self, current_position, collected_resource, path, game_map):
    if collected_resource >= 1000:
        return path
    path_stay = path.copy().append(stay_still())
    self.get_path_to_1000(current_position, collected_resource +
                          math.ceil(0.25 * game_map[current_position].value),
                          path_stay, game_map.copy().collect(current_position))
    cost = math.floor(0.1 * game_map[current_position].value)
    if collected_resource >= cost:
        direction_list = [Direction.West, Direction.North, Direction.East]
        move_list = [go_left(), go_up(), go_right()]
        for i in range(3):
            new_path = path.copy().append(move_list[i])
            self.get_path_to_1000(
                current_position.offset(direction_list[i]),
                collected_resource - cost, new_path, game_map)
The problem with my approach is that the algorithm never completes because it keeps trying longer and longer lists of the ship staying still.
How can I alter my algorithm so it actually tries more than one option, returning a relatively short (or shortest) path to 1000?
The nature of this problem (ignoring the exact mechanics of the rounding down/variable cost of moving) is to find the shortest number of nodes in order to acquire 1,000 resources. Another way to look at this goal is that the ship is trying to find the most efficient move with each turn.
This issue can be approached with a slightly modified version of Dijkstra's algorithm. Instead of greedily choosing the move with the least weight, we choose the move with the most weight (greatest number of resources) and add this value to a running counter that makes sure we reach 1000 resources total. By greedily taking the most efficient move at each step (while below 1000), this heuristic aims for a small number of moves to reach a total of 1000 resources.
Simply keep a list of the moves made with the algorithm and return that list when the resource counter reaches 1000.
Here's a helpful resource on how to best implement Dijkstra's algorithm:
https://www.geeksforgeeks.org/dijkstras-shortest-path-algorithm-greedy-algo-7/
With these few modifications, it should be your best bet!
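As a rough sketch of that greedy "best move each turn" idea (the grid representation and all names here are my assumptions, and a purely greedy step does not guarantee the shortest overall path):

```python
import math

# Hypothetical greedy step chooser: grid maps (x, y) -> resource value.
# Returns the best (action, new_position, immediate_gain) this turn.
def greedy_step(grid, pos, collected):
    x, y = pos
    options = []
    # staying collects 25% of the current tile, rounded up
    options.append(('stay', pos, math.ceil(0.25 * grid[pos])))
    cost = math.floor(0.1 * grid[pos])  # moving costs 10%, rounded down
    if collected >= cost:  # moves we cannot afford are illegal
        for name, (dx, dy) in [('left', (-1, 0)), ('up', (0, 1)), ('right', (1, 0))]:
            npos = (x + dx, y + dy)
            if npos in grid:
                # value of a move: what we could collect next turn, minus the cost
                options.append((name, npos, math.ceil(0.25 * grid[npos]) - cost))
    return max(options, key=lambda o: o[2])
```

Running this in a loop, accumulating resources until the counter reaches 1000, yields the move list the question asks for (under the caveat above).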

Minimax Algorithm Implementation In Python3

I have been trying to build a Tic-Tac-Toe bot in Python. I tried to avoid using the Minimax algorithm, because I was QUITE daunted by how to implement it. Until now.
I (finally) wrote an algorithm that sucked and could lose pretty easily, which kinda defeats the purpose of making a computer play Tic-Tac-Toe. So I finally took the courage to TRY to implement the algorithm. I stumbled upon this StackOverflow post. I tried to implement the chosen answer there, but I can't understand most of the stuff. The code in that answer follows:
def minimax(self, player, depth = 0):
    if player == "o":
        best = -10
    else:
        best = 10
    if self.complete():
        if self.getWinner() == "x":  # 'X' is the computer
            return -10 + depth, None
        elif self.getWinner() == "tie":
            return 0, None
        elif self.getWinner() == "o":  # 'O' is the human
            return 10 - depth, None
    for move in self.getAvailableMoves():
        self.makeMove(move, player)
        val, _ = self.minimax(self.getEnemyPlayer(player), depth + 1)
        print(val)
        self.makeMove(move, ".")
        if player == "o":
            if val > best:
                best, bestMove = val, move
        else:
            if val < best:
                best, bestMove = val, move
    return best, bestMove
First of all, why are we returning -10 + depth when the computer wins and 10 - depth when the human wins? (I get why we return 0 when it is a draw.) Secondly, what is the depth parameter doing? Is there some way to omit it? Should we omit it?
I'm probably missing something fundamental about the algorithm but, I think I understand it well enough. Please bear in mind that I'm very new to recursive algorithms...
EDIT
So, now I made myself the function:
def minimax(self, player):
    won = 10
    lost = -10
    draw = 0
    if self.has_won(HUMAN):
        return lost, None
    elif self.has_won(BOT):
        return won, None
    if not(self.board_is_empty()):
        return draw, None
    moves = self.get_available_moves()
    for move in moves:
        self.play_move(move[0], move[1], player)
        make_board(self.board)
        if self.board_is_empty():
            val, _ = self.minimax(self.get_enemy_player(player))
            self.rewind_move(move)
            if val == won:
                return val, move
But the problem now is that I can't understand what happens when the move ends in a draw or a loss (for the computer). I think what it's doing is going through a move's consequences to see if SOMEONE wins (that's probably what is happening, because I tested it) and then returning that move if SOMEONE wins. How do I modify this code to work properly?
Note:
This function is in a class, hence the self keywords.
moves is a list containing tuples, e.g. moves = [(0, 1), (2, 2)], etc. So moves contains all the empty squares, and each moves[i][j] is an integer modulo 3.
I'm using the exhaustive algorithm suggested by Jacques de Hooge in his answer below.
First note that 10 - depth = - (-10 + depth).
So computer wins have opposite signs from human wins.
In this way they can be added to evaluate the value of a gameboard state.
While with tic-tac-toe this isn't really needed, in a game like chess it is, since it is too time-consuming to try all possible combinations until checkmate; hence gameboard states have to be evaluated somehow in terms of losses and wins (losing and winning chess pieces, each worth a certain number of points, values drawn from experience).
Suppose now we only look at 10 - depth (so human wins).
The most attractive wins are the ones that require the least plies (moves).
Since each move or countermove results in the depth being incremented, more moves make the parameter depth larger, and so 10 - depth (the "amount" of advantage) smaller. So quick wins are favored over lengthy ones. 10 is enough, since only 9 moves in total are possible on a 3 x 3 playfield.
So in short: since tic-tac-toe is so simple, the winning combination can in fact be found by an exhaustive recursive search. But the minimax algorithm is suitable for more complicated situations like chess, in which intermediate situations have to be evaluated in terms of the sum of losses (negative) and gains (positive).
Should the depth parameter be omitted? If you care about the quickest win: No. If you only care about a win (with tictactoe): it can indeed be omitted, since an exhaustive search is possible.
[EDIT]
Exhaustive with tictactoe just means searching 9 plies deep, since the game can never last longer.
Make a recursive function with a parameter player (o or x) and a return value win or loss, that is first decided at the deepest recursion level, and then taken upward through the recursion tree. Let it call itself with the opposite player as parameter for all free fields. A move for the machine is the right one if any sequel results in the machine winning for all branches that the human may take on each level.
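The recursion described above can be sketched as a minimal self-contained minimax for tic-tac-toe (here 'x' is the maximizing player, which is the reverse of the sign convention in the question's code; the board encoding is my own):

```python
# Board: a list of 9 cells, each 'x', 'o', or '.'.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player, depth=0):
    w = winner(board)
    if w == 'x':                   # 'x' maximizes: quicker wins score higher
        return 10 - depth, None
    if w == 'o':                   # 'o' minimizes: its wins are negative
        return depth - 10, None
    moves = [i for i, cell in enumerate(board) if cell == '.']
    if not moves:
        return 0, None             # draw
    best_val, best_move = None, None
    for m in moves:
        board[m] = player          # try the move
        val, _ = minimax(board, 'o' if player == 'x' else 'x', depth + 1)
        board[m] = '.'             # undo it
        if best_val is None or (player == 'x' and val > best_val) \
                or (player == 'o' and val < best_val):
            best_val, best_move = val, m
    return best_val, best_move
```

On an empty board this returns value 0, reflecting that perfect play from both sides is a draw.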
Note: The assumption I made is that there IS a winning strategy. If that is not the case (ties are possible), the algorithm you have may be the best option. In fact, with perfect play on both sides tic-tac-toe ends in a tie, so the player who starts can only enforce at least a draw, not a win.
Against a non-perfect human player the computer may still win, whether or not it starts, whenever the human does something suboptimal.
