python numpy array memory error in loop

I'm experiencing a very weird problem when using a large numpy array. Here's the basic context. I have about 15 lists of paired objects which I am constructing adjacency matrices for. Each adjacency matrix is about 4000 x 4000 (square matrix where the diagonal means the object is paired with itself) so it's big but not too big. Here's the basic setup of my code:
import numpy

def createAdjacencyMatrix(pairedObjectList, objectIndexList):
    N = len(objectIndexList)
    adj = numpy.zeros((N, N))
    for i in range(len(pairedObjectList)):
        # put a 1 in the correct row/column position for the pair etc.
        pass
    return adj
In my script I call this function about 15 times, once for each paired object list. However, every time I run it I get this error:
adj = np.zeros((N,N))
MemoryError
I really don't understand where the memory error is coming from. Even though I'm making this big matrix, it only exists within the scope of that function, so shouldn't it be cleared from memory every time the function is finished? Not to mention, if the same variable is hanging around in memory, then shouldn't it just overwrite those memory positions?
Any help understanding this would be much appreciated.
EDIT : Here's the full output of the traceback
Traceback (most recent call last):
File "create_biogrid_adjacencies.py", line 119, in <module>
adjMat = dsu.createAdjacencyMatrix(proteinList,idxRefDict)
File "E:\Matt\Documents\Research\NU\networks_project\data_setup_utils.py", line 18, in createAdjacencyMatrix
adj = np.zeros((N,N))
MemoryError
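For a sense of scale, here is a minimal sketch of the arithmetic, not the asker's actual code. A 4000 x 4000 array of the default float64 type takes 128 MB, and the array returned by the function stays alive for as long as the caller keeps a reference to it, so 15 results held at the same time come to roughly 1.9 GB. The names `dense_matrix_bytes`, `create_adjacency_matrix` and `index_of`, the symmetric fill, and the choice of `uint8` are all assumptions made for illustration.

```python
import numpy as np

def dense_matrix_bytes(n, dtype=np.float64):
    """Bytes needed for a dense n x n array of the given dtype."""
    return n * n * np.dtype(dtype).itemsize

def create_adjacency_matrix(pairs, index_of, dtype=np.uint8):
    """Build a 0/1 adjacency matrix (assumes undirected pairs).

    pairs    -- iterable of (a, b) object pairs (hypothetical input format)
    index_of -- dict mapping each object to its row/column index
    """
    n = len(index_of)
    adj = np.zeros((n, n), dtype=dtype)  # 1 byte per entry instead of 8
    for a, b in pairs:
        i, j = index_of[a], index_of[b]
        adj[i, j] = 1
        adj[j, i] = 1
    return adj

float64_bytes = dense_matrix_bytes(4000)          # default dtype of np.zeros
uint8_bytes = dense_matrix_bytes(4000, np.uint8)  # enough for 0/1 entries
```

If each matrix is written to disk or otherwise consumed before the next call, rebinding the same variable in the calling loop lets the previous array be released. Appending every result to a list keeps all of them in memory.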

Related

Python: 'float' object is not subscriptable when iteration function

I am teaching myself Python and am trying out a challenge I found to create a quote program for a gardener. I have almost all of it working and have added in iteration so that the user can make more than one quote without re-starting the program.
It produces the quote perfectly the first time but on the second run it presents this error:
Traceback (most recent call last):
File "/Users/shaunrogers/Desktop/Plymstock Prep/GCSE CS/SOL/Controlled Assessment/Sample Papers Solutions/gardening Task 2.py", line 105, in <module>
lawn = m2_items("lawn",0)
File "/Users/shaunrogers/Desktop/Plymstock Prep/GCSE CS/SOL/Controlled Assessment/Sample Papers Solutions/gardening Task 2.py", line 23, in m2_items
minutes = area*time[index]
TypeError: 'float' object is not subscriptable
I have the following code as a function that is producing the error:
def m2_items (item,index):
    global costs, time, LABOUR
    length = int(input("How long is the "+ item+"?\n"))
    width = int(input("How wide is the "+item+"?\n"))
    area = length*width
    cost_m2 = float(costs[index])
    total_cost = area*cost_m2
    minutes = area*time[index]
    hours = int(minutes/60)
    labour = LABOUR*hours
    labour_cost=round(labour,2)
    m2_details = [area, cost_m2, total_cost,hours, labour_cost]
    return m2_details
I have tried re-setting the local variables on the running of the function (but I didn't think this was needed as the variables should be removed from memory once the function has run).
I hope the question is clear and that I can get some insight. To re-iterate, what I want the program to do is allow me to call this function multiple times.

You are using the global time variable, which is initially subscriptable (probably an array). As your program continues, some other part of your code will assign a new value to time, maybe accidentally because you wrote time = some_calculation() instead of time[i] = some_calculation(), or maybe you are using the name time somewhere else without realizing it's already in use.
Do a search for all the places where you use the name time and you will probably find your error.
This is a common problem with global variables. Sometimes something updates them from another part of the code, and the error will sneak up on you like this.
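A minimal sketch of the failure mode described above, using made-up data. The list contents and the names `minutes_for`, `minutes_for_safe` and `MINUTES_PER_M2` are hypothetical. The first half reproduces the error by rebinding the shared name. The second half shows one possible way to avoid it, by passing the table in as a parameter.

```python
# Hypothetical data: minutes of labour per square metre for each item.
time = [2.0, 5.0]

def minutes_for(area, index):
    return area * time[index]  # reads the shared name `time`

first_quote = minutes_for(10, 0)  # works: `time` is still a list

# The kind of slip the answer describes: the whole name is rebound
# to a float instead of one element being updated.
time = first_quote
try:
    minutes_for(10, 0)
except TypeError as exc:
    error_message = str(exc)  # 'float' object is not subscriptable

def minutes_for_safe(area, index, minutes_per_m2):
    # The table arrives as an argument, so no other code can rebind it
    # behind this function's back.
    return area * minutes_per_m2[index]

MINUTES_PER_M2 = (2.0, 5.0)  # a tuple also guards against element edits
second_quote = minutes_for_safe(10, 0, MINUTES_PER_M2)
```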

Race condition in line of Python

I have an interesting problem. I am -- for shits and giggles -- trying to write a really short program. I have it down to 2 lines, but it has a race condition, and I can't figure out why. Here's the gist of it:
imports...
...[setattr(__main__, 'f', [1, 2, ..]), reduce(...random.choice(f)...)][1]...
Every once in a while, the following exception will be generated. But NOT always. That's my problem. I suspect that the order of execution is not guaranteed especially since I'm using the list trick -- I would assume that maybe the interpreter can predict that setattr() returns None and knows that I'm only selecting the 2nd thing in the list, so it defers the actual setattr() to later. But it only happens sometimes. Any ideas? Does CPython automatically thread some things like map, filter, reduce calls?
Traceback (most recent call last):
File "/usr/lib64/python3.4/random.py", line 253, in choice
i = self._randbelow(len(seq))
File "/usr/lib64/python3.4/random.py", line 230, in _randbelow
r = getrandbits(k) # 0 <= r < 2**k
ValueError: number of bits must be greater than zero
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test4.py", line 2, in <module>
print(" ".join([setattr(n,'f',open(sys.argv[1],"r").read().replace("\n"," ").split(" ")),setattr(n,'m',c.defaultdict(list)),g.reduce(lambda p,e:p+[r.choice(m[p[-1]])],range(int(sys.argv[2])),[r.choice(list(filter(lambda x:[m[x[0]].append(x[1]),x[0].isupper()][1],zip(f[:-1],f[1:]))))[0]])][2]))
File "test4.py", line 2, in <lambda>
print(" ".join([setattr(n,'f',open(sys.argv[1],"r").read().replace("\n"," ").split(" ")),setattr(n,'m',c.defaultdict(list)),g.reduce(lambda p,e:p+[r.choice(m[p[-1]])],range(int(sys.argv[2])),[r.choice(list(filter(lambda x:[m[x[0]].append(x[1]),x[0].isupper()][1],zip(f[:-1],f[1:]))))[0]])][2]))
File "/usr/lib64/python3.4/random.py", line 255, in choice
raise IndexError('Cannot choose from an empty sequence')
IndexError: Cannot choose from an empty sequence
I've tried modifying globals() and vars() instead of using setattr(), but that does not seem to help (same exception sequence).
Here's the actual code:
import sys,collections as c,random as r,functools as g,__main__ as n
print(" ".join([setattr(n,'f',open(sys.argv[1],"r").read().replace("\n"," ").split(" ")),setattr(n,'m',c.defaultdict(list)),g.reduce(lambda p,e:p+[r.choice(m[p[-1]])],range(int(sys.argv[2])),[r.choice(list(filter(lambda x:[m[x[0]].append(x[1]),x[0].isupper()][1],zip(f[:-1],f[1:]))))[0]])][2]))
If you're curious: This is to read in a text file, generate a Markov model, and spit out a sentence.

random.choice()
Well, of course that is nondeterministic. If you are very careful, you could set the seed of the pseudo-random number generator to something constant, and hope that it produces the same sequence every time. There's a good chance it will work.
random.seed(42); ...

Alright, here's what actually happened: In my sentence generation, I sometimes hit the last word in the file (which in some cases, depending on the file, does not have a possible successor state). Hence, I'm trying to choose from an empty list in that case.
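The explanation above can be sketched in a more readable form. Python evaluates the elements of a list display left to right, and CPython does not run reduce on separate threads, so no race condition is involved. The occasional failure is the empty successor list. The sketch below is not the two-line original. The function names `build_model` and `generate` are made up for illustration, and stopping early is only one possible way to handle a word with no successors.

```python
import random
from collections import defaultdict

def build_model(words):
    """Map each word to the list of words that follow it in the text."""
    model = defaultdict(list)
    for current, following in zip(words[:-1], words[1:]):
        model[current].append(following)
    return model

def generate(model, start, length, rng=random):
    """Walk the chain for at most `length` steps, stopping at a dead end."""
    sentence = [start]
    for _ in range(length):
        # .get() avoids inserting an empty list into the defaultdict
        successors = model.get(sentence[-1])
        if not successors:
            break  # the last word of the file has nothing after it
        sentence.append(rng.choice(successors))
    return sentence

words = "a b c".split()
model = build_model(words)
sentence = generate(model, "a", 5)
```

With this tiny input every word has at most one successor, so the walk is deterministic and ends at the last word instead of raising IndexError.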

TypeError in Numpy happening randomly

I'm having a weird issue with numpy right now on a current assignment. I'm making an ant colony AI for a class (based on the Google AI Ants Challenge), and I'm using a diffusion based approach where I essentially diffuse out a scent from food/enemy hills on each turn. I've been using numpy since each turn basically consists of doing a lot of matrix manipulations, but I just recently got a weird bug that I can't figure out.
At the beginning of each turn, I update the field associated with each scent before I run the diffusion iterations:
# Here I update the "potential" field (hills_f) and the
# diffusion values (hills_l) for the hills scent. Diffusion values
# (lambda values) are 1 except for on ants, where they are higher
# or lower depending on their colony.
self.hills_f *= TURN_DECAY
self.hills_l = np.ones_like(self.hills_l)
# Update the lambda matrix
for r,c in ants.my_ants():
    self.hills_l[r][c] = MY_HILLS_LAMBDA
for r,c in ants.enemy_ants():
    self.hills_l[r][c] = ENEMY_HILLS_LAMBDA
So this code runs at the beginning of each turn (along with a similar snippet for the food scent), but on a random turn (ranges from 10-40), I get the following error:
Traceback (most recent call last):
File "long_file_path...", line 167, in run
bot.do_turn(ants)
File "MyBot.py", line 137, in do_turn
self.hills_l[r][c] = ENEMY_HILLS_LAMBDA
TypeError: 'numpy.float64' object does not support item assignment
It looks like it randomly turns self.hills_l into a scalar between the two for-loops, which doesn't make any sense to me. It's also weird that there's similar code for the food scent, which doesn't crash ever, and that this problem shows up so non-deterministically.
I can post more code if necessary, but I think everything should be there, especially since the problem seems to occur between the for loops.
Thanks!
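One possible reading of the traceback, offered as a guess since the rest of the bot is not shown: the message says that `self.hills_l[r]` is a `numpy.float64`, which is what indexing a 1-D array returns. If the attribute were a true scalar, the first index `[r]` would be expected to fail before the assignment was attempted. The sketch below reproduces the same message with a stand-in 1-D array. The variable names are hypothetical.

```python
import numpy as np

field = np.ones(5)  # 1-D stand-in for a field that has lost a dimension
r, c = 1, 2

try:
    field[r][c] = 0.5  # field[r] is a numpy.float64 scalar here
except TypeError as exc:
    chained_error = str(exc)

try:
    field[r, c] = 0.5  # one index expression reports the shape problem
except IndexError as exc:
    tuple_error = str(exc)
```

Writing `field[r, c]` instead of `field[r][c]` does not fix the underlying problem, but the resulting error points at the array's dimensionality. Checking `self.hills_l.shape` at the start of each turn might then show where the shape changes.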

My Pacman agent doesn't reach its destination, and I get the error mentioned below

[SearchAgent] using function depthFirstSearch
[SearchAgent] using problem type PositionSearchProblem
Path found with total cost of 999999 in 0.0 seconds
Search nodes expanded: 1
Traceback (most recent call last):
File "C:\Documents and Settings\vpn\My Documents\Aptana Studio 3 Workspace\Project 1 - Search\pacman.py", line 672, in <module>
runGames( **args )
File "C:\Documents and Settings\vpn\My Documents\Aptana Studio 3 Workspace\Project 1 - Search\pacman.py", line 638, in runGames
game.run()
File "C:\Documents and Settings\vpn\My Documents\Aptana Studio 3 Workspace\Project 1 - Search\game.py", line 662, in run
action = agent.getAction(observation)
File "C:\Documents and Settings\vpn\My Documents\Aptana Studio 3 Workspace\Project 1 - Search\searchAgents.py", line 121, in getAction
if i < len(self.actions):
TypeError: object of type 'NoneType' has no len()
I'm working on a project in which I have to get my Pacman agent to reach its destination with the least possible space and time complexity by applying DFS (depth-first search).
My code for it is:
stack = util.Stack()
explored = list()
start = problem.getStartState()
for item in problem.getSuccessors(start):
    state = item[0]
    path = list()
    path.append(item[1])
    stateInfo = (state, path)
    stack.push(stateInfo)
explored.append(start)
while stack.isEmpty():
    state = stack.pop()
    if problem.isGoalState(state[0]):
        return state[1]
    for states in problem.getSuccessor(state[0]):
        newstate = states[0]
        newpath = list(state[1])
        newpath.append(states[1])
        newstateInfo = (newstate, newpath)
        stack.push(newstateInfo)
    explored.append(state[0])
What am I supposed to do now? My Pacman agent gets stuck at its starting position, facing east, which is the opposite direction to its destination.
The supporting files needed to run the agent are available at https://www.edx.org/courses/BerkeleyX/CS188.1x/2012_Fall/courseware/Week_2/Project_1_Search/

I think I know what is going on (though I can't be certain given the limited amount of code in the question).
My guess is that the code you've posted is getting called to set up the variable self.actions where you're getting an error. The error is happening because you're getting None returned from the function, rather than the list you expect.
The underlying bug is that your main loop has its test backwards. You want while not stack.isEmpty(), rather than what you have. Because you've just pushed several values onto the stack, the loop as written exits immediately. After the loop you reach the end of the function, which is equivalent to returning None in Python. That None is causing the exception later.
Even if you fix the broken loop, falling off the end of the function can happen anyway if a path cannot be found to the goal state. I suggest adding code to detect this, and respond appropriately. You could raise an exception (since it shouldn't happen in normal situations), or maybe return an empty list.
There's another issue where you're not checking if a newly visited state has already been explored. This could lead to an infinite loop, as you search from one state to a neighbor and then back. A set will also be better than a list for explored. You can check it with if state in explored (which also works on lists, but less efficiently).
As a side issue, some of your variables are very poorly named. state holds different types at different times, and having a second variable states makes it even more confusing, especially with indexing thrown in.
I suggest unpacking the tuples you pushed onto the stack with the code state, path = stack.pop(), which will solve part of the issue. Then just rename states to something else (or perhaps unpack it too, with something like for neighbor, direction in problem.getSuccessor(state)) and you'll be good to go.
You can also save a few lines of code by pushing the start state to your stack, rather than doing an extra loop over its neighbors in the setup code.
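Put together, the suggestions above might look like the following sketch. It is not the course's reference solution. `ToyProblem` is a hypothetical stand-in for the framework's problem class, a plain list replaces `util.Stack`, and the three-element successor tuples (state, action, cost) are an assumption, since the question only shows indices 0 and 1 being used.

```python
class ToyProblem:
    """Hypothetical stand-in for the course's search problem class."""

    def __init__(self, graph, start, goal):
        self.graph, self.start, self.goal = graph, start, goal

    def getStartState(self):
        return self.start

    def isGoalState(self, state):
        return state == self.goal

    def getSuccessors(self, state):
        # Assumed format: list of (next_state, action, step_cost) tuples.
        return self.graph.get(state, [])

def depth_first_search(problem):
    # Push the start state itself; no special-case loop over its neighbours.
    stack = [(problem.getStartState(), [])]
    explored = set()  # set membership tests are faster than list scans
    while stack:      # loop while the stack is NOT empty
        state, path = stack.pop()
        if problem.isGoalState(state):
            return path
        if state in explored:
            continue
        explored.add(state)
        for neighbor, direction, _cost in problem.getSuccessors(state):
            if neighbor not in explored:
                stack.append((neighbor, path + [direction]))
    return []  # no path found; raising an exception is the other option

graph = {
    "A": [("B", "East", 1), ("C", "South", 1)],
    "B": [("A", "West", 1), ("D", "South", 1)],
    "C": [("D", "East", 1)],
    "D": [],
}
route = depth_first_search(ToyProblem(graph, "A", "D"))
no_route = depth_first_search(ToyProblem(graph, "D", "A"))
```

Returning an empty list keeps the caller's `len(self.actions)` check working, but it cannot be told apart from a start state that is already the goal. Raising an exception avoids that ambiguity.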

Python/Numpy MemoryError

Basically, I am getting a memory error in Python when trying to perform an algebraic operation on a numpy matrix. The variable u is a large matrix of doubles (in the failing case it's a 288x288x156 matrix of doubles; I only get this error in this huge case, and I am able to do this on other large matrices, just not ones this big). Here is the Python error:
Traceback (most recent call last):
File "S:\3D_Simulation_Data\Patient SPM Segmentation\20 pct perim erosion flattop\SwSim.py", line 121, in __init__
self.mainSimLoop()
File "S:\3D_Simulation_Data\Patient SPM Segmentation\20 pct perim erosion flattop\SwSim.py", line 309, in mainSimLoop
u = solver.solve_cg(u,b,tensors,param,fdHold,resid) # Solve the left hand side of the equation Au=b with conjugate gradient method to approximate u
File "S:\3D_Simulation_Data\Patient SPM Segmentation\20 pct perim erosion flattop\conjugate_getb.py", line 47, in solve_cg
u = u + alpha*p
MemoryError
u = u + alpha*p is the line of code that fails.
alpha is just a double, while u and r are the large matrices described above (both of the same size).
I don't know that much about memory errors especially in Python. Any insight/tips into solving this would be very appreciated!
Thanks

Rewrite to
p *= alpha
u += p
and this will use much less memory. p = p*alpha allocates a whole new matrix for the result of p*alpha and then discards the old p, whereas p *= alpha does the same thing in place.
In general, with big matrices, try to use op= assignment.
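One caveat with the rewrite above is that `p *= alpha` changes `p` itself, which matters if the solver uses `p` again later in the iteration. The following is a minimal sketch of an alternative, assuming a preallocated scratch buffer is acceptable. The function name `add_scaled_inplace` and the `scratch` argument are hypothetical. It updates `u` in place and leaves `p` untouched by writing the product into a buffer that can be reused on every iteration.

```python
import numpy as np

def add_scaled_inplace(u, alpha, p, scratch):
    """Compute u += alpha * p without modifying p.

    scratch must have the same shape and dtype as p. Allocating it once
    outside the solver loop avoids a fresh temporary on every iteration.
    """
    np.multiply(p, alpha, out=scratch)  # product lands in existing memory
    u += scratch                        # in-place add, no new array
    return u

u = np.ones((2, 2))
p = np.full((2, 2), 2.0)
scratch = np.empty_like(p)
result = add_scaled_inplace(u, 0.5, p, scratch)
```

This still needs one extra array the size of `p`, but it is allocated once rather than twice per update.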

Another tip I have found to avoid memory errors is to manually control garbage collection. When objects are deleted or go out of scope, memory held in reference cycles isn't freed until a garbage collection is performed (CPython frees other objects as soon as their last reference disappears). I have found with some of my code using large numpy arrays that I get a MemoryError, but that I can avoid this if I insert calls to gc.collect() at appropriate places.
You should only look into this option if using "op=" style operators etc doesn't solve your problem as it's probably not the best coding practice to have gc.collect() calls everywhere.
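As a rough sketch of what such a call site could look like, the loop below drops each large array explicitly and then runs the collector before the next allocation. The function name `process_in_chunks` and the workload are made up for illustration. In CPython the `del` usually does most of the work, and `gc.collect()` only helps when reference cycles are keeping arrays alive.

```python
import gc
import numpy as np

def process_in_chunks(n_chunks, shape):
    """Create, reduce and release one large array per iteration."""
    totals = []
    for _ in range(n_chunks):
        block = np.ones(shape)            # stand-in for a large array
        totals.append(float(block.sum()))
        del block      # drop the last reference before the next allocation
        gc.collect()   # also reclaim anything trapped in reference cycles
    return totals

totals = process_in_chunks(3, (10, 10))
```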

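Your matrix has 288x288x156=12,939,264 entries, which at 8 bytes per double is about 100 MB per array. Evaluating u = u + alpha*p needs u, p, the temporary alpha*p, and the result in memory at the same time, which comes to around 400 MB. numpy throwing a MemoryError at you just means that the memory needed to perform the operation in the function you called wasn't available from the OS.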
If you can work with sparse matrices this might save you a lot of memory.
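The size can be checked with a short worked calculation. This is only arithmetic on the shape given in the question. The count of four arrays alive at once (u, p, the temporary alpha*p, and the result) is an assumption about how `u = u + alpha*p` is evaluated, and the float32 figure assumes the reduced precision is acceptable for the simulation.

```python
import math
import numpy as np

shape = (288, 288, 156)
entries = math.prod(shape)  # number of elements in one array

bytes_float64 = entries * np.dtype(np.float64).itemsize
bytes_float32 = entries * np.dtype(np.float32).itemsize

# u, p, the temporary alpha*p, and the result of the addition
peak_bytes = 4 * bytes_float64
```

That is about 104 MB per float64 array and roughly four times that while the expression is being evaluated. Any other arrays the solver holds come on top of this.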
