Python Dijkstra Algorithm Memory Error?

I copied and pasted this answer for Dijkstra's algorithm into my project. It seemed OK after several simple tests.
In my specific implementation, I need the algorithm to return a list of nodes, so I modified the original code so that it always returns a list. More specifically, I removed all the return "string" lines. My modified code is as follows:
## using Dijkstra Algorithm ##
def choosePath(s, t):
    net = {'0':{'1':138, '9':150},
           '1':{'0':138, '2':178, '8':194},
           '2':{'1':178, '3':47.5},
           '3':{'2':47.5, '4':70},
           '4':{'3':70, '5':70},
           '5':{'4':70, '6':36},
           '6':{'5':36, '7':50},
           '7':{'6':50, '8':81},
           '8':{'7':81, '9':138, '1':194},
           '9':{'8':138, '0':150}}
    # sanity check
    if s == t:
        return []
    # create a labels dictionary
    labels={}
    # record whether a label was updated
    order={}
    # populate an initial labels dictionary
    for i in net.keys():
        if i == s: labels[i] = 0 # shortest distance form s to s is 0
        else: labels[i] = float("inf") # initial labels are infinity
    from copy import copy
    drop1 = copy(labels) # used for looping
    ## begin algorithm
    while len(drop1) > 0:
        # find the key with the lowest label
        minNode = min(drop1, key = drop1.get) #minNode is the node with the smallest label
        # update labels for nodes that are connected to minNode
        for i in net[minNode]:
            if labels[i] > (labels[minNode] + net[minNode][i]):
                labels[i] = labels[minNode] + net[minNode][i]
                drop1[i] = labels[minNode] + net[minNode][i]
                order[i] = minNode
        del drop1[minNode] # once a node has been visited, it's excluded from drop1
    ## end algorithm
    # print shortest path
    temp = copy(t)
    rpath = []
    path = []
    while 1:
        rpath.append(temp)
        if order.has_key(temp):
            temp = order[temp]
        if temp == s:
            rpath.append(temp)
            break
    for j in range(len(rpath)-1,-1,-1):
        path.append(rpath[j])
    return [junctions[int(elem)] for elem in path]
Then when I run it, I end up with the following error:
>>> Traceback (most recent call last):
File "C:\Users\...\simulation.py", line 162, in choosePath
rpath.append(temp)
MemoryError
Obviously, it is because I have removed the return "string" lines. However, I failed to find out which deletion makes it die. Why is it so?
How may I make it work again AND always returns a list instead of a string as I wish?

I suspect your problem is that you're passing the wrong arguments to the function. You want to call choosePath('0', '9'). Strings. Not integers.
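To see why integer arguments would end in a MemoryError: the keys of net (and therefore of order) are strings, so an integer temp is never found in order, never changes, and never equals s. The while 1 loop then appends to rpath until memory runs out. Here is a minimal sketch of that backtracking step on its own, with a check for a missing predecessor. The helper name backtrack is hypothetical and not part of the original code.

```python
def backtrack(order, s, t):
    """Walk predecessor links from t back to s and return the path s..t.

    Hypothetical helper: 'order' maps each node to its predecessor,
    as in the question's code.
    """
    rpath = [t]
    temp = t
    while temp != s:
        if temp not in order:
            # Without this check the loop would spin forever,
            # which is what happens when s and t are ints.
            raise KeyError("no predecessor recorded for %r" % (temp,))
        temp = order[temp]
        rpath.append(temp)
    return rpath[::-1]


order = {'1': '0', '2': '1'}
print(backtrack(order, '0', '2'))  # string keys match the dictionary
```

Calling backtrack(order, 0, 2) with integers raises KeyError right away instead of growing a list without bound.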
What's comical is that if ANY of the parts of the program you removed were still there, it would have caught this and stopped the program. With this part, it catches if your input is wrong.
if net.has_key(s)==False:
    return "There is no start node called " + str(s) + "."
if net.has_key(t)==False:
    return "There is no terminal node called " + str(t) + "."
With this part, it catches if it never reaches a solution.
else: return "There is no path from " + str(s) + " to " + str(t) + "."
The sanity checks are not strictly necessary, since as you mentioned a path is assured in your net. Still the checks are nice because if you ever do choose to change things around you'll know the computer will call you out on obvious mistakes. One option is to replace them with exceptions, since none of these messages should really come up unless something has gone horribly wrong. That's what I opted for in the following code.
class NoPathException(Exception):
    pass
def choosePath(s, t):
    net = {'0':{'1':138, '9':150},
           '1':{'0':138, '2':178, '8':194},
           '2':{'1':178, '3':47.5},
           '3':{'2':47.5, '4':70},
           '4':{'3':70, '5':70},
           '5':{'4':70, '6':36},
           '6':{'5':36, '7':50},
           '7':{'6':50, '8':81},
           '8':{'7':81, '9':138, '1':194},
           '9':{'8':138, '0':150}}
    # sanity check
    if s == t:
        return []
    if s not in net: # python2 alternative: if not net.has_key(s):
        raise ValueError("start node argument not in net")
    if t not in net: # python2 alternative: if not net.has_key(t):
        raise ValueError("end node argument not in net")
    # create a labels dictionary
    labels={}
    # record the predecessor of each node whose label was updated
    order={}
    # populate an initial labels dictionary
    for i in net.keys():
        if i == s: labels[i] = 0 # shortest distance from s to s is 0
        else: labels[i] = float("inf") # initial labels are infinity
    from copy import copy
    drop1 = copy(labels) # used for looping
    ## begin algorithm
    while len(drop1) > 0:
        # find the key with the lowest label
        minNode = min(drop1, key = drop1.get) # minNode is the node with the smallest label
        # update labels for nodes that are connected to minNode
        for i in net[minNode]:
            if labels[i] > (labels[minNode] + net[minNode][i]):
                labels[i] = labels[minNode] + net[minNode][i]
                drop1[i] = labels[minNode] + net[minNode][i]
                order[i] = minNode
        del drop1[minNode] # once a node has been visited, it's excluded from drop1
    ## end algorithm
    # build the shortest path by walking back from t to s
    temp = copy(t)
    rpath = []
    path = []
    while 1:
        rpath.append(temp)
        if temp in order:
            temp = order[temp]
        else:
            raise NoPathException("no path to solution")
        if temp == s:
            rpath.append(temp)
            break
    for j in range(len(rpath)-1,-1,-1):
        path.append(rpath[j])
    return path
Testing
a = choosePath('3', '9')
print(a)
['3', '4', '5', '6', '7', '8', '9']
Is this the output you're looking for?

Related

comparison Algorithm for two hashes

As an initial situation, I have a sha1 hash value. I want to compare this with a file full of hash values to see if the sha1 hash value is contained in the file with the hash values.
So more exactly:
f1=sha1 #value read in
fobj = open("Hashvalues.txt", "r") #open file with hash values
for f1 in fobj:
    print ("hash value found")
else:
    print("HashValue not found")
fobj.close()
The file is very large (11.1GB)
Is there a useful algorithm to perform the search as fast as possible? The hash values in the hash file are ordered by hashes.
I think comparing this line by line won't be the fastest way, will it?
EDIT:
I changed my Code as follows:
f1="9bc34549d565d9505b287de0cd20ac77be1d3f2c" #value read in
with open("pwned-passwords-sha1-ordered-by-hash-v5.txt") as f:
    lineList = [line.rstrip('\n\r') for line in open("pwned-passwords-sha1-ordered-by-hash-v5.txt")]
def binarySearch(arr, l, r, x):
    while l <= r:
        mid = l + (r - l)/2;
        # Check if x is present at mid
        if arr[mid] == x:
            return mid
        # If x is greater, ignore left half
        elif arr[mid] < x:
            l = mid + 1
        # If x is smaller, ignore right half
        else:
            r = mid - 1
    # If we reach here, then the element
    # was not present
    return -1
# Test array
arr = lineList
x = "9bc34549d565d9505b287de0cd20ac77be1d3f2c" #value read in
# Function call
result = binarySearch(arr, 0, len(arr)-1, x)
if result != -1:
    print "Element is present at index % d" % result
else:
    print "Element is not present in array"
But it doesn't run as fast as I expected. Is my implementation correct?
EDIT2:
def binarySearch (l, r, x):
    # Check base case
    if r >= l:
        mid = l + (r - l)/2
        # If element is present at the middle itself
        if getLineFromFile(mid) == x:
            return mid
        # If element is smaller than mid, then it
        # can only be present in left subarray
        elif getLineFromFile(mid) > x:
            return binarySearch(l, mid-1, x)
        # Else the element can only be present
        # in right subarray
        else:
            return binarySearch(mid + 1, r, x)
    else:
        # Element is not present in the array
        return -1
x = '0000000A0E3B9F25FF41DE4B5AC238C2D545C7A8:15'
def getLineFromFile(lineNumber):
    with open('testfile.txt') as f:
        for i, line in enumerate(f):
            if i == lineNumber:
                return line
        else:
            print('Not 7 lines in file')
            line = None
# get last Element of List
def tail():
    for line in open('pwned.txt', 'r'):
        pass
    else:
        print line
ausgabetail = tail()
#print ausgabetail
result = binarySearch( 0, ausgabetail, x)
if result != -1:
    print "Element is present at index % d" % result
else:
    print "Element is not present in array"
My problem now is getting the correct right-hand index for the binary search. I pass the function (l, r, x). The left side starts at the beginning with 0. The right side should be the end of the file, i.e. the last line. I tried to get that with the function tail(), but it doesn't work: if I print r while testing, I get the value "None".
Do you have another idea here?
Looking at the code, I see that you are still reading all the lines from the file. That is the bottleneck, not the binary search.
Assuming that the hashes are sorted, you only need the number of lines in the file, and then you can perform the binary search on line numbers. Instead of reading the entire file, you can use seek to jump directly to a position in the file, so that you only read about log(n) lines. That should increase the speed.
Example
def binarySearch(l, r, x):
    ....
    #change arr[mid] with getLineFromFile(mid)
    ....
def getLineFromFile(lineNumber):
    with open('xxx.txt') as f:
        for i, line in enumerate(f):
            if i == lineNumber:
                return line
        else:
            print('Not 7 lines in file')
            line = None
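Note that getLineFromFile above still scans the file from the start on every call. A minimal sketch of the seek idea, searching over byte offsets rather than line numbers, could look like the following. It assumes the file is sorted by hash, that each line has the form HASH:count with uppercase hex (as in the EDIT2 example), and that the function name sorted_file_contains and its helper are hypothetical.

```python
import os


def sorted_file_contains(path, sha1_hex):
    """Return True if sha1_hex is the key of some line in the sorted file."""
    target = sha1_hex.upper().encode("ascii")

    def key_after(f, pos):
        # Key of the first full line that starts after byte offset pos
        # (or the very first line when pos == 0); None at end of file.
        f.seek(pos)
        if pos > 0:
            f.readline()  # discard the rest of the line containing pos
        line = f.readline()
        if not line:
            return None
        return line.split(b":", 1)[0].strip()

    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)
        # Find the smallest offset whose following line has key >= target.
        while lo < hi:
            mid = (lo + hi) // 2
            key = key_after(f, mid)
            if key is None or key >= target:
                hi = mid
            else:
                lo = mid + 1
        return key_after(f, lo) == target
```

Each probe reads at most two lines, so the number of reads grows with the logarithm of the file size in bytes, and the last line never has to be located up front.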

Need help understanding USACO solution

I was trying to solve the Broken Necklace problem from USACO and I came across this solution. The problem statement is here: https://train.usaco.org/usacoprob2?S=beads&a=c3sjno1crwH
I am confused why the person who wrote this solution made 3 copies of the initial string, and basically the entire for loop.
I have tried looking for other solutions online that might explain it better, but there is a small number of python solutions to this problem and many of them are completely different.
'''
ID: krishpa2
LANG: PYTHON3
TASK: beads
'''
with open('beads.in','r') as fin:
    N = int(fin.readline())
    beads = fin.readline()[:-1]
def canCollect(s):
    return not ('r' in s and 'b' in s)
beads = beads*3
max = 0
for p in range(N, N*2):
    i = p-1
    left = []
    while i > 0:
        if canCollect(left + [beads[i]]):
            left.append(beads[i])
            i -= 1
        else:
            break
    i = p
    right = []
    while i < 3*N - 1:
        if canCollect(right + [beads[i]]):
            right.append(beads[i])
            i+=1
        else:
            break
    result = len(left) + len(right)
    if result >= N:
        max = N
        break
    elif result > max:
        max = result
print(max)
with open('beads.out','w') as fout:
    fout.write(str(max) + '\n')
The program works correctly; I just want to understand why.
I know that this question is pretty old, but I still want to answer it for future readers, so I have made a fully commented version below (based on this answer by joshjq91):
"""
PROG: beads
LANG: PYTHON3
#FILE
"""
# Original file from Github by joshjq91
# (https://github.com/jadeGeist/USACO/blob/master/1.2.4-beads.py)
# Comments by Ayush
with open('beads.in','r') as filein:
    N = int(filein.readline()) # Number of beads
    beads = filein.readline()[:-1] # Necklace
def canCollect(s):
    # If r and b are not in the same str, then you can collect the string.
    return not ('r' in s and 'b' in s)
# Wraparound - r actually can be shown as r r r (wraparound for the front and back)
beads = beads*3
max = 0 # The final result
# Loop through the 2nd bead string (so you can use wraparounds for the front and back)
for p in range(N, N*2):
    i = p-1
    left = []
    while i > 0: # Check if you can collect beads (left)
        if canCollect(left + [beads[i]]): # Can collect
            left.append(beads[i]) # Add to left
            i -= 1 # Loop through again
        else:
            break # Cannot collect more beads - break
    i = p # You will otherwise have a duplicate bead (left is i=p-1)
    right = []
    # Check if you can collect beads (right) - i has to be less than 3*N - 1
    # b/c that is the length of the beads + runarounds.
    while i < 3*N - 1:
        #print("righti",i-N) # for testing
        if canCollect(right + [beads[i]]): # Can collect
            right.append(beads[i]) # Add to right
            i+=1 # Loop through again
        else:
            break # Cannot collect more beads - break
    result = len(left) + len(right) # Final result
    # The result was greater than N means that the whole necklace is the same (EX: rwr)
    if result >= N:
        max = N
        # Break - we now know we don't need to go through again b/c the
        # whole string is the same!
        break
    elif result > max: # The result makes sense
        max = result
with open('beads.out','w') as fileout:
    fileout.write(str(max) + '\n') # Final result
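To make the role of the three copies concrete, here is a compact sketch of the same idea, not the original author's code. The break point p only ranges over the middle copy, so a run going left can spill into the first copy and a run going right can spill into the third copy without any modulo arithmetic. The names run_length and max_beads are hypothetical.

```python
def run_length(s, start, step):
    """Count beads collectable from index start, moving by step (+1 or -1)."""
    color = None  # first non-white color seen in this run
    count = 0
    i = start
    while 0 <= i < len(s):
        bead = s[i]
        if bead != 'w':
            if color is None:
                color = bead
            elif bead != color:
                break  # hit the other color: the run ends here
        count += 1
        i += step
    return count


def max_beads(beads):
    n = len(beads)
    tripled = beads * 3  # copies 1 and 3 stand in for the wraparound
    best = 0
    for p in range(n, 2 * n):  # break points inside the middle copy only
        total = run_length(tripled, p - 1, -1) + run_length(tripled, p, +1)
        best = max(best, min(total, n))  # cannot collect more than n beads
    return best


print(max_beads("rrbb"))
```

The min(total, n) cap plays the same role as the result >= N check in the commented version: when the two runs overlap, every bead has been collected.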

What's wrong with my recursion implementation?

I have recently started learning programming, just completed a course on edX. I was trying to solve this problem on HackerRank and it is running out of time in each case. What am I doing wrong?
n,k = input().strip().split(' ')
n,k = [int(n),int(k)]
x = [int(x_temp) for x_temp in input().strip().split(' ')]
x.sort()
def transmitter(aList=[], target=0):
    '''
    accepts a list of house location, and a target location for the transmitter
    returns the optimal number of transmitters required to cover all the houses
    '''
    List = aList[:]
    start = target - k
    end = target + k + 1
    for i in range(start, end):
        if i in List:
            List.remove(i)
    if not List:
        return 1
    m = max(List)
    for e in List:
        if transmitter(List, e) < m:
            m = transmitter(List, e)
    return 1 + m
m = max(x)
for e in x:
    if transmitter(x, e) < m:
        m = transmitter(x, e)
print(m)
I am pretty new to this. Sorry for making any obvious mistakes, or for posting this here in case this is not the suitable site. In that case, it will be really helpful if you can recommend a site where I can ask such question.
the screenshot of the question
I'm pretty sure a greedy algorithm solves this problem optimally in just O(N) time once the houses are sorted. There's no need for any recursion. Just place each transmitter in turn as far to the right as you can without leaving any houses to its left uncovered. Stop when the last house is covered.
Here's how I'd code that:
def hackerland(houses, k): # houses should be sorted list of locations
    first = None # location of first uncovered house
    last = 0 # last location covered by a previous transmitter
    prev = None
    count = 0 # transmitters = []
    for x in houses:
        if first is not None and x > first + k:
            first = None
            count += 1 # transmitters.append(prev)
            last = prev + k
        if last is not None and x > last:
            last = None
            first = x
        prev = x
    if first is not None:
        count += 1 # transmitters.append(prev)
    return count # return transmitters
I've included comments that show how this code could be easily modified to return a list of the transmitter locations, rather than just a count of how many are needed.
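The same greedy rule can also be written with explicit index scanning, which some readers may find easier to follow. This is a sketch under the assumption that house locations are integers and k is the transmitter range; the function name min_transmitters is made up for illustration.

```python
def min_transmitters(houses, k):
    """Greedy count of transmitters of range k needed to cover all houses."""
    xs = sorted(set(houses))  # duplicates do not change the answer
    n = len(xs)
    i = 0
    count = 0
    while i < n:
        leftmost = xs[i]  # first house that is still uncovered
        # Move right to the farthest house that can still cover leftmost.
        while i + 1 < n and xs[i + 1] <= leftmost + k:
            i += 1
        tower = xs[i]
        count += 1
        # Skip every house covered to the right of the new transmitter.
        while i < n and xs[i] <= tower + k:
            i += 1
    return count


print(min_transmitters([1, 2, 3, 4, 5], 1))
```

After sorting, each house is visited a constant number of times, which matches the linear running time claimed above.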
It is not necessary to take a recursive approach. In fact, you can just work forward, iterate over the houses, placing transmitters when the previously placed one does not reach far enough to cover the current house, etc.
It is a bit more complicated than that, but not much. See this code:
# input
n,k = input().strip().split(' ')
n,k = [int(n),int(k)]
x = [int(x_temp) for x_temp in input().strip().split(' ')]
# eliminate duplicate house x-coordinates, they don't influence the result
houses = list(set(x))
houses.sort()
# add extreme far dummy house (will make the loop easier)
houses.append(100000)
reachedX = 0 # coordinate until where the previously placed transmitter reaches
unreachedX = -1 # coordinate that the next one needs to cover (to the left)
lastHouseId = -1 # index where previous transmitter was placed
transmitters = [] # coordinates of the placed transmitters
for houseId, houseX in enumerate(houses):
    if reachedX > unreachedX: # we might still be in range of last transmitter
        if houseX > reachedX: # we just went out of reach
            unreachedX = houseX # this house must be covered by next one
    elif houseX - k > unreachedX: # transmitter here wouldn't reach far enough back
        lastHouseId = houseId - 1 # place it on previous house
        reachedX = houses[lastHouseId] + k
        transmitters.append(houses[lastHouseId])
print(transmitters)
print(len(transmitters))

Python Dynamic Knapsack

Right now I am attempting to code the knapsack problem in Python 3.2. I am trying to do this dynamically with a matrix. The algorithm that I am trying to use is as follows
Implements the memory function method for the knapsack problem
Input: A nonnegative integer i indicating the number of the first
       items being considered and a nonnegative integer j indicating the knapsack's capacity
Output: The value of an optimal feasible subset of the first i items
Note: Uses as global variables input arrays Weights[1..n], Values[1..n]
      and table V[0..n, 0..W] whose entries are initialized with -1's except for
      row 0 and column 0 initialized with 0's
if V[i, j] < 0
    if j < Weights[i]
        value <-- MFKnapsack(i - 1, j)
    else
        value <-- max(MFKnapsack(i - 1, j),
                      Values[i] + MFKnapsack(i - 1, j - Weights[i]))
    V[i, j] <-- value
return V[i, j]
If you run my code below, you can see that it tries to insert the weight into the list. Since this uses recursion, I am having a hard time spotting the problem. I also get the error: can not add an integer with a list using the '+'. The matrix is initialized with all 0's in the first row and first column; everything else is initialized to -1. Any help will be much appreciated.
#Knapsack Problem
def knapsack(weight,value,capacity):
    weight.insert(0,0)
    value.insert(0,0)
    print("Weights: ",weight)
    print("Values: ",value)
    capacityJ = capacity+1
    ## ------ initialize matrix F ---- ##
    dimension = len(weight)+1
    F = [[-1]*capacityJ]*dimension
    #first column zeroed
    for i in range(dimension):
        F[i][0] = 0
    #first row zeroed
    F[0] = [0]*capacityJ
    #-------------------------------- ##
    d_index = dimension-2
    print(matrixFormat(F))
    return recKnap(F,weight,value,d_index,capacity)
def recKnap(matrix, weight,value,index, capacity):
    print("index:",index,"capacity:",capacity)
    if matrix[index][capacity] < 0:
        if capacity < weight[index]:
            value = recKnap(matrix,weight,value,index-1,capacity)
        else:
            value = max(recKnap(matrix,weight,value,index-1,capacity),
                        value[index] +
                        recKnap(matrix,weight,value,index-1,capacity-(weight[index])))
        matrix[index][capacity] = value
        print("matrix:",matrix)
    return matrix[index][capacity]
def matrixFormat(*doubleLst):
    matrix = str(list(doubleLst)[0])
    length = len(matrix)-1
    temp = '|'
    currChar = ''
    nextChar = ''
    i = 0
    while i < length:
        if matrix[i] == ']':
            temp = temp + '|\n|'
        #double digit
        elif matrix[i].isdigit() and matrix[i+1].isdigit():
            temp = temp + (matrix[i]+matrix[i+1]).center(4)
            i = i+2
            continue
        #negative double digit
        elif matrix[i] == '-' and matrix[i+1].isdigit() and matrix[i+2].isdigit():
            temp = temp + (matrix[i]+matrix[i+1]+matrix[i+2]).center(4)
            i = i + 2
            continue
        #negative single digit
        elif matrix[i] == '-' and matrix[i+1].isdigit():
            temp = temp + (matrix[i]+matrix[i+1]).center(4)
            i = i + 2
            continue
        elif matrix[i].isdigit():
            temp = temp + matrix[i].center(4)
        #updates next round
        currChar = matrix[i]
        nextChar = matrix[i+1]
        i = i + 1
    return temp[:-1]
def main():
    print("Knapsack Program")
    #num = input("Enter the weights you have for objects you would like to have:")
    #weightlst = []
    #valuelst = []
    ## for i in range(int(num)):
    ##     value , weight = eval(input("What is the " + str(i) + " object value, weight you wish to put in the knapsack? ex. 2,3: "))
    ##     weightlst.append(weight)
    ##     valuelst.append(value)
    weightLst = [2,1,3,2]
    valueLst = [12,10,20,15]
    capacity = 5
    value = knapsack(weightLst,valueLst,5)
    print("\n Max Matrix")
    print(matrixFormat(value))
main()
F = [[-1]*capacityJ]*dimension
does not properly initialize the matrix. [-1]*capacityJ is fine, but [...]*dimension creates dimension references to the exact same list. So modifying one list modifies them all.
Try instead
F = [[-1]*capacityJ for _ in range(dimension)]
This is a common Python pitfall. See this post for more explanation.
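A small demonstration of the difference, using made-up dimensions (rows and cols are placeholders, not values from the question):

```python
rows, cols = 3, 4

# Outer multiplication copies references: every row is the same list object.
aliased = [[-1] * cols] * rows
aliased[0][0] = 0
print(aliased)  # the 0 shows up in every row

# The comprehension builds a fresh inner list for each row.
independent = [[-1] * cols for _ in range(rows)]
independent[0][0] = 0
print(independent)  # only the first row changes
```

Checking aliased[0] is aliased[1] returns True, which confirms that the rows are one shared object rather than copies.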
To illustrate caching, I generally use a defaultdict as follows:
from collections import defaultdict
CS = defaultdict(lambda: defaultdict(int)) #if i want to make default vals as 0
###or
CACHE_1 = defaultdict(lambda: defaultdict(lambda: int(-1))) #if i want to make default vals as -1 (or something else)
This keeps me from making the 2d arrays in python on the fly...
To see an answer to z1knapsack using this approach:
http://ideone.com/fUKZmq
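For readers who cannot open the link, here is a rough sketch of a memoized knapsack along the lines of the pseudocode in the question. It is not the code behind the link. It uses a plain dict keyed by (i, j) in place of the nested defaultdict, and the name mf_knapsack is an assumption for illustration.

```python
def mf_knapsack(weights, values, capacity):
    """Memoized 0/1 knapsack: best value using the first i items with capacity j."""
    cache = {}  # (i, j) -> best value; absent keys play the role of the -1 entries

    def best(i, j):
        if i == 0 or j == 0:
            return 0  # row 0 and column 0 of the table are zero
        if (i, j) not in cache:
            skip = best(i - 1, j)  # leave item i out
            if j < weights[i - 1]:
                cache[(i, j)] = skip  # item i does not fit
            else:
                take = values[i - 1] + best(i - 1, j - weights[i - 1])
                cache[(i, j)] = max(skip, take)
        return cache[(i, j)]

    return best(len(weights), capacity)


print(mf_knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))
```

Because the cache is a dictionary, there is no 2D list to initialize and therefore no chance of the row-aliasing problem described above. Items are 1-based in the recursion, so weights[i - 1] avoids inserting a dummy 0 at the front of the input lists.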
def zeroes(n,m):
    v=[['-' for i in range(0,n)]for j in range(0,m)]
    return v
value=[0,12,10,20,15]
w=[0,2,1,3,2]
v=zeroes(6,5)
def knap(i,j):
    global v
    if i==0 or j==0:
        v[i][j]= 0
    elif j<w[i] :
        v[i][j]=knap(i-1,j)
    else:
        v[i][j]=max(knap(i-1,j),value[i]+knap(i-1,j-w[i]))
    return v[i][j]
x=knap(4,5)
print (x)
for i in range (0,len(v)):
    for j in range(0,len(v[0])):
        print(v[i][j],end="\t\t")
    print()
    print()
#now these calls are for filling all the boxes in the matrix as in the above call only few v[i][j]were called and returned
knap(4,1)
knap(4,2)
knap(4,3)
knap(4,4)
for i in range (0,len(v)):
    for j in range(0,len(v[0])):
        print(v[i][j],end="\t\t")
    print()
    print()

Python Dijkstra Algorithm

I am trying to write Dijkstra's Algorithm, however I am struggling on how to 'say' certain things in code.
To visualize, here are the columns I want represented using arrays:
max_nodes
     A   B   C   Length   Predecessor   Visited/Unvisited
A    0   1   2   -1                     U
B    1   0   1   -1                     U
C    2   1   0   -1                     U
So, there will be several arrays, as seen in my code below:
def dijkstra (graph, start, end)
    network[max_nodes][max_nodes]
    state [max_nodes][length]
    state2 [max_nodes][predecessor]
    state3 [max_nodes][visited]
    initialNode = 0
    for nodes in graph:
        D[max_nodes][length] = -1
        P[max_nodes][predecessor] = ""
        V[max_nodes][visited] = false
    for l in graph:
        length = lengthFromSource[node] + graph[node][l]
        if length < lengthFromSourceNode[w]:
            state[l][length] = x
            state2[l][predecessor]
            state3[l][visited] = true
        x +=1
The part in bold is where I am stuck on - I am trying to implement this section of the algorithm:
3. For current node, consider all its unvisited neighbors and calculate their tentative distance. For example, if current node (A) has distance of 6, and an edge connecting it with another node (B) is 2, the distance to B through A will be 6+2=8. If this distance is less than the previously recorded distance, overwrite the distance
4. When we are done considering all neighbors of the current node, mark it as visited. A visited node will not be checked ever again; its distance recorded now is final and minimal
I think I am on the right track; I'm just stuck on how to say 'start at a node, get the length from the source to a node, if the length is smaller, overwrite the previous value, then move to the next node'.
I also used a dictionary to store the network.
Data is in the following format:
source: {destination: cost}
create a network dictionary (user provided)
net = {'0':{'1':100, '2':300},
       '1':{'3':500, '4':500, '5':100},
       '2':{'4':100, '5':100},
       '3':{'5':20},
       '4':{'5':20},
       '5':{}
       }
shortest path algorithm (user needs to specify start and terminal nodes)
def dijkstra(net, s, t):
    # sanity check
    if s == t:
        return "The start and terminal nodes are the same. Minimum distance is 0."
    if s not in net: # python2: if net.has_key(s)==False:
        return "There is no start node called " + str(s) + "."
    if t not in net: # python2: if net.has_key(t)==False:
        return "There is no terminal node called " + str(t) + "."
    # create a labels dictionary
    labels={}
    # record the predecessor of each node whose label was updated
    order={}
    # populate an initial labels dictionary
    for i in net.keys():
        if i == s: labels[i] = 0 # shortest distance from s to s is 0
        else: labels[i] = float("inf") # initial labels are infinity
    from copy import copy
    drop1 = copy(labels) # used for looping
    ## begin algorithm
    while len(drop1) > 0:
        # find the key with the lowest label
        minNode = min(drop1, key = drop1.get) # minNode is the node with the smallest label
        # update labels for nodes that are connected to minNode
        for i in net[minNode]:
            if labels[i] > (labels[minNode] + net[minNode][i]):
                labels[i] = labels[minNode] + net[minNode][i]
                drop1[i] = labels[minNode] + net[minNode][i]
                order[i] = minNode
        del drop1[minNode] # once a node has been visited, it's excluded from drop1
    ## end algorithm
    # build the shortest path by walking back from t to s
    temp = copy(t)
    rpath = []
    path = []
    while 1:
        rpath.append(temp)
        if temp in order: temp = order[temp] # python2: if order.has_key(temp): temp = order[temp]
        else: return "There is no path from " + str(s) + " to " + str(t) + "."
        if temp == s:
            rpath.append(temp)
            break
    for j in range(len(rpath)-1,-1,-1):
        path.append(rpath[j])
    return "The shortest path from " + s + " to " + t + " is " + str(path) + ". Minimum distance is " + str(labels[t]) + "."
# Given a large random network find the shortest path from '0' to '5'
print(dijkstra(net, s='0', t='5'))
First, I assume this is a homework problem, as the best suggestion would be to not bother writing it yourself but to find an existing implementation on the web. Here's one that looks pretty good, for example.
Assuming you do need to reinvent the wheel, the code referenced there uses dictionaries to store the node data. So you feed it something like:
{
's': {'u' : 10, 'x' : 5},
'u': {'v' : 1, 'x' : 2},
'v': {'y' : 4},
'x': {'u' : 3, 'v' : 9, 'y' : 2},
'y': {'s' : 7, 'v' : 6}
}
This seems a more intuitive way of presenting your graph information. Visited nodes and distances can be kept in dictionaries as well.
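As a rough illustration of keeping distances, predecessors and visited nodes in dictionaries and sets, here is a sketch that runs on the graph above. It is not the implementation referenced in this answer. It swaps in a heapq priority queue for picking the closest unvisited node, and the name shortest_path is hypothetical.

```python
import heapq


def shortest_path(graph, start, goal):
    """Return (distance, path) from start to goal, or None if unreachable."""
    dist = {start: 0}   # best known distance to each node
    prev = {}           # predecessor of each node on its best path
    visited = set()     # nodes whose distance is final
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)  # step 4: mark as visited, distance is final
        if node == goal:
            break
        for neighbor, cost in graph.get(node, {}).items():
            candidate = d + cost  # step 3: tentative distance through node
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]


graph = {
    's': {'u': 10, 'x': 5},
    'u': {'v': 1, 'x': 2},
    'v': {'y': 4},
    'x': {'u': 3, 'v': 9, 'y': 2},
    'y': {'s': 7, 'v': 6},
}
print(shortest_path(graph, 's', 'v'))
```

The two commented lines correspond to steps 3 and 4 quoted in the question: relax each neighbor's tentative distance, then treat the current node as final.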
