Free up memory by deleting numpy arrays - python

I have written a fatigue analysis program with a GUI. The program takes strain information for unit loads for each element of a finite element model, reads in a load case using np.genfromtxt('loadcasefilename.txt') and then does some fatigue analysis and saves the result for each element in another array.
The load case files are about 32 MB each as text, and there are 40 or so which get read and analysed in a loop. The loads for each element are interpolated by taking slices of the load case array.
The GUI and fatigue analysis run in separate threads. When you click 'Start' on the fatigue analysis it starts the loop over the load cases in the fatigue analysis.
This brings me to my problem. If I have a lot of elements, the analysis will not finish. How early it quits depends on how many elements there are, which makes me think it might be a memory problem. I've tried fixing this by deleting the load case array at the end of each loop (after deleting all the arrays which are slices of it) and running gc.collect(), but this has not had any success.
In MATLAB, I'd use the 'pack' function to write the workspace to disk, clear it, and then reload it at the end of each loop. I know this isn't good practice, but it would get the job done! Can I do the equivalent in Python somehow?
Code below:
for LoadCaseNo in range(len(LoadCases[0]['LoadCaseLoops'])):  # range(1):  # xxx
    # Get load case data
    self.statustext.emit('Opening current load case file...')
    LoadCaseFilePath = LoadCases[0]['LoadCasePaths'][LoadCaseNo][0]
    # TK: load case paths may be different
    try:
        with open(LoadCaseFilePath):
            pass
    except Exception as e:
        self.statustext.emit(str(e))
    LoadCaseLoops = LoadCases[0]['LoadCaseLoops'][LoadCaseNo, 0]
    LoadCase = np.genfromtxt(LoadCaseFilePath, delimiter=',')
    LoadCaseArray = np.array(LoadCases[0]['LoadCaseLoops'])
    LoadCaseArray = LoadCaseArray / np.sum(LoadCaseArray, axis=0)
    # Loop through sections
    for SectionNo in range(len(Sections)):  # range(100):  # xxx
        SectionCount = len(Sections)
        # Get section data
        Elements = Sections[SectionNo]['elements']
        UnitStrains = Sections[SectionNo]['strains'][:, 1:]
        Nodes = Sections[SectionNo]['nodes']
        rootdist = Sections[SectionNo]['rootdist']
        # Interpolate load case data at this section
        NeighbourFind = rootdist - np.reshape(LoadCase[0, 1:], (1, -1))
        NeighbourFind[NeighbourFind < 0] = 1e100
        nearest = np.unravel_index(NeighbourFind.argmin(), NeighbourFind.shape)
        nearestcol = int(nearest[1])
        Distance0 = LoadCase[0, nearestcol + 1]
        Distance1 = LoadCase[0, nearestcol + 7]
        MxLow = LoadCase[1:, nearestcol + 1]
        MxHigh = LoadCase[1:, nearestcol + 7]
        MyLow = LoadCase[1:, nearestcol + 2]
        MyHigh = LoadCase[1:, nearestcol + 8]
        MzLow = LoadCase[1:, nearestcol + 3]
        MzHigh = LoadCase[1:, nearestcol + 9]
        FxLow = LoadCase[1:, nearestcol + 4]
        FxHigh = LoadCase[1:, nearestcol + 10]
        FyLow = LoadCase[1:, nearestcol + 5]
        FyHigh = LoadCase[1:, nearestcol + 11]
        FzLow = LoadCase[1:, nearestcol + 6]
        FzHigh = LoadCase[1:, nearestcol + 12]
        InterpFactor = (rootdist - Distance0) / (Distance1 - Distance0)
        Mx = MxLow + (MxHigh - MxLow) * InterpFactor[0, 0]
        My = MyLow + (MyHigh - MyLow) * InterpFactor[0, 0]
        Mz = MzLow + (MzHigh - MzLow) * InterpFactor[0, 0]
        Fx = -FxLow + (FxHigh - FxLow) * InterpFactor[0, 0]
        Fy = -FyLow + (FyHigh - FyLow) * InterpFactor[0, 0]
        Fz = FzLow + (FzHigh - FzLow) * InterpFactor[0, 0]
        # Loop through section coordinates
        for ElementNo in range(len(Elements)):
            MaterialID = int(Elements[ElementNo, 1])
            if Materials[MaterialID]['curvefit'][0, 0] != 3:
                StrainHist = UnitStrains[ElementNo, 0] * Mx + UnitStrains[ElementNo, 1] * My + UnitStrains[ElementNo, 2] * Fz
            elif Materials[MaterialID]['curvefit'][0, 0] == 3:
                StrainHist = UnitStrains[ElementNo, 3] * Fx + UnitStrains[ElementNo, 4] * Fy + UnitStrains[ElementNo, 5] * Mz
            EndIn = len(StrainHist)
            Extrema = np.bitwise_or(
                np.bitwise_and(StrainHist[1:EndIn - 1] <= StrainHist[0:EndIn - 2],
                               StrainHist[1:EndIn - 1] <= StrainHist[2:EndIn]),
                np.bitwise_and(StrainHist[1:EndIn - 1] >= StrainHist[0:EndIn - 2],
                               StrainHist[1:EndIn - 1] >= StrainHist[2:EndIn]))
            Extrema = np.concatenate((np.array([True]), Extrema, np.array([True])), axis=0)
            Extrema = StrainHist[np.where(Extrema == True)]
            del StrainHist
            # Do fatigue analysis
        self.statustext.emit('Analysing load case ' + str(LoadCaseNo + 1) + ' of ' + str(len(LoadCases[0]['LoadCaseLoops'])) + ' - ' + str(((SectionNo + 1) * 100) / SectionCount) + '% complete')
        del MxLow, MxHigh, MyLow, MyHigh, MzLow, MzHigh, FxLow, FxHigh, FyLow, FyHigh, FzLow, FzHigh, Mx, My, Mz, Fx, Fy, Fz, Distance0, Distance1
        gc.collect()

There's obviously a retain cycle or other leak somewhere, but without seeing your code, it's impossible to say more than that. But since you seem to be more interested in workarounds than solutions…
In MATLAB, I'd use the 'pack' function to write the workspace to disk, clear it, and then reload it at the end of each loop. I know this isn't good practice, but it would get the job done! Can I do the equivalent in Python somehow?
No, Python doesn't have any equivalent to pack. (Of course if you know exactly what set of values you want to keep around, you can always np.savetxt or pickle.dump or otherwise stash them, then exec or spawn a new interpreter instance, then np.loadtxt or pickle.load or otherwise restore those values. But then if you know exactly what set of values you want to keep around, you probably aren't going to have this problem in the first place, unless you've actually hit an unknown memory leak in NumPy, which is unlikely.)
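For completeness, here is a minimal sketch of that stash-and-restart workaround, assuming you really do know the exact set of values to keep (the file name keep.pkl and the next_stage.py script are hypothetical):
import pickle
import subprocess
import sys

results = {'partial_damage': [0.1, 0.2]}     # stand-in for whatever must survive the restart

with open('keep.pkl', 'wb') as f:            # stash the keep-set on disk
    pickle.dump(results, f)

# A fresh interpreter starts with a clean heap; next_stage.py (hypothetical) would
# open 'keep.pkl', pickle.load() it, and carry on from there.
subprocess.check_call([sys.executable, 'next_stage.py'])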
But it has something that may be better. Kick off a child process to analyze each element (or each batch of elements, if they're small enough that the process-spawning overhead matters), send the results back in a file or over a queue, then quit.
For example, if you're doing this:
def analyze(thingy):
    a = build_giant_array(thingy)
    result = process_giant_array(a)
    return result

total = 0
for thingy in thingies:
    total += analyze(thingy)
You can change it to this:
def wrap_analyze(thingy, q):
    q.put(analyze(thingy))

total = 0
for thingy in thingies:
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=wrap_analyze, args=(thingy, q))
    p.start()
    p.join()
    total += q.get()
(This assumes that each thingy and result is both smallish and pickleable. If it's a huge NumPy array, look into NumPy's shared memory wrappers, which are designed to make things much easier when you need to share memory directly between processes instead of passing it.)
But you may want to look at what multiprocessing.Pool can do to automate this for you (and to make it easier to extend the code to, e.g., use all your cores in parallel). Notice that it has a maxtasksperchild parameter, which you can use to recycle the pool processes every, say, 10 thingies, so they don't run out of memory.
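For example, a minimal sketch of the Pool variant, assuming Python 3 and the analyze() and thingies from the example above:
import multiprocessing

if __name__ == '__main__':
    # maxtasksperchild=10 recycles each worker process after 10 tasks, so any
    # memory a task leaves lying around is reclaimed when that process exits.
    with multiprocessing.Pool(processes=4, maxtasksperchild=10) as pool:
        total = sum(pool.imap_unordered(analyze, thingies))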
But back to actually trying to solve things briefly:
I've tried fixing this by deleting the load case array at the end of each loop (after deleting all the arrays which are slices of it) and running gc.collect() but this has not had any success.
None of that should make any difference at all. If you're just reassigning the local variables to new values each time through the loop, and aren't keeping references to them anywhere else, then the old arrays get freed anyway, so you never have more than two of them alive at a time, and only briefly. And gc.collect() only helps if there are reference cycles. So, on the one hand, it's good news that these had no effect: it means there's nothing obviously stupid in your code. On the other hand, it's bad news: it means that whatever's wrong isn't obviously stupid.
Usually people see this because they keep growing some data structure without realizing it. For example, maybe you vstack all the new rows onto the old version of giant_array instead of onto an empty array, then delete the old version… but it doesn't matter, because each time through the loop, giant_array isn't 5*N, it's 5*N, then 10*N, then 15*N, and so on. (That's just an example of something stupid I did not long ago… Again, it's hard to give more specific examples while knowing nothing about your code.)
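As a toy illustration of that mistake (with made-up names and sizes), the buggy loop stacks onto last iteration's array, so it grows on every pass, while the fixed loop starts from an empty array each time:
import numpy as np

new_rows = np.ones((5, 3))

giant_array = np.empty((0, 3))
for i in range(3):
    giant_array = np.vstack((giant_array, new_rows))        # buggy: 5, then 10, then 15 rows
    print(giant_array.shape)

for i in range(3):
    giant_array = np.vstack((np.empty((0, 3)), new_rows))   # fixed: always 5 rows
    print(giant_array.shape)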

Related

Is it possible to fully delete a TextIOWrapper?

Background
=============
Let's say I'm writing some unit tests, and I want to test the re-opening of a log file (it was corrupted for some reason outside or inside my program). I currently have a TextIOWrapper from originally running open(), which I want to fully delete or "clean up". Once it's cleaned up, I want to re-run open(), and I want the ID of that new TextIOWrapper to be something new.
Problem
=============
It seems to re-appear with the same ID. How do I fully clean this thing up? Is it a lost cause for some reason hidden in the docs?
Debug
=============
My actual code has more try/except blocks for various edge cases, but here's the gist:
import gc # I don't want to do this
# create log
log = open("log", "w")
id(log) # result = 01111311110
# close log and delete everything I can think to delete
log.close()
log.__del__()
del log
gc.collect()
# TODO clean up some special way?
# re-open the log
log = open("log", "a")
id(log) # result = 01111311110
Why is that resulting ID still the same?
Theory 1: Due to the way the IO stream works, the TextIOWrapper will end up in the same place in memory for a given file, and my method of testing this function needs re-work.
Theory 2: Somehow I am not properly cleaning this up.
I think you do enough cleanup by simply calling log.close(). My hypothesis (now proven; see the edit below) is based on the fact that the example below delivers the result you were expecting from the code in your question.
It seems that Python reuses id values for some reason.
Try this example:
log = open("log", "w")
print(id(log)) # result = 01111311110
# close log and delete everything I can think to delete
log.close()
log = open("log", "a")
print(id(log))
log.close()
[edit]
I found proof of my hypothesis:
The id is unique only as long as an object is alive. Objects that have no references left to them are removed from memory, allowing the id() value to be re-used for another object, hence the non-overlapping lifetimes wording.
In CPython, id() is the memory address. New objects will be slotted into the next available memory space, so if a specific memory address has enough space to hold the next new object, the memory address will be reused.
The moment all references to an object are gone, the reference count on the object drops to 0 and it is deleted, there and then.
Garbage collection only is needed to break cyclic references, objects that reference one another only, with no further references to the cycle. Because such a cycle will never reach a reference count of 0 without help, the garbage collector periodically checks for such cycles and breaks one of the references to help clear those objects from memory.
more info on Python's reuse of id values at How unique is Python's id()?
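A quick way to convince yourself of this: keep the first wrapper alive while opening the second one and the ids differ; release it and the address becomes available again. A minimal sketch (none of this is guaranteed by the language):
log1 = open("log", "w")
log1.close()

log2 = open("log", "a")          # log1 is still referenced, so its slot can't be reused
print(id(log1) == id(log2))      # typically False

del log1                         # now the first object really is gone
log3 = open("log", "a")
print(id(log2) == id(log3))      # False: both are alive at once
print(id(log3))                  # often the same address log1 had, but not guaranteed

log2.close()
log3.close()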

Python for loop iteration not completing, skipping steps

I got a 'wrong answer' error in my "queue from stack" algorithm when I expected it to work. For those not familiar with the algorithm, the solution requires two list-based stacks: a "push stack" and a "pop stack", which is in effect a queue buffer that the push stack dumps itself into when the pop stack gets called and is empty. See if you can determine what's going on and where the problem is.
def pop(self):
    self.stack_to_push_to = [1, 2]   # sample hard coding
    self.queue_to_pop = []           # sample hard coding
    if not self.queue_to_pop:        # trigger a dump to form a new queue buffer
        for _ in self.stack_to_push_to:
            self.queue_to_pop.append(self.stack_to_push_to.pop())
    print(self.queue_to_pop)         # [2] but expected [2,1]
Too much was being done on the append line, clever and concise though it seemed. When you pop from a list that is currently being iterated over, the remaining items shift position while the iterator's internal index keeps advancing, so items get skipped. A similar thing happens in Excel when you delete rows while traversing down them (though not when going up). I just assumed Python would be able to handle this on its own somehow.
Problematic code.
if not self.queue_to_pop:   # trigger a dump to form a new sub-queue
    for _ in self.stack_to_push_to:
        self.queue_to_pop.append(self.stack_to_push_to.pop())   # !!!!!!
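The same skip can be reproduced with a plain list, outside the class:
stack = [1, 2]
out = []
for _ in stack:
    out.append(stack.pop())
print(out)    # [2] -- the loop ends after one pass because the list shrank underneath the iterator
print(stack)  # [1] -- this item never made it across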
Using pop() had the appeal of automatically working backwards, which is what I want, and I also thought it would save time and space, but it ultimately didn't work, and I found two alternatives that do. I'd be interested in learning whether there's a way to reduce the complexity of my solutions.
Option 1:
for i in range(len(self.stack_to_push_to) - 1, -1, -1):
    self.queue_buffer.append(self.stack_to_push_to[i])
self.stack_to_push_to = []
Option 2:
for _ in reversed(self.stack_to_push_to):
    self.queue_buffer.append(_)
self.stack_to_push_to = []
I didn't see anyone else post this issue so I thought it was worth sharing and hope it enlightens others.

Why is assigning whole Blender array so much faster than assigning through iteration?

I have two functions that both generate random noise, but one of them takes several seconds to complete while the other runs in milliseconds. At the hardware level, where does the difference come from? When I use the second function, are instructions executed in parallel, or is the whole array in memory moved to the corresponding memory where the image is stored?
from random import random

def slow_random():
    # pixels is just a flat list of pixel colour values, where 0 <= value <= 1
    for i in range(len(bpy.data.images['img'].pixels)):
        bpy.data.images['img'].pixels[i] = random()

def quick_random():
    rand = [random() for i in range(len(bpy.data.images['img'].pixels))]
    bpy.data.images['img'].pixels[0:] = rand
Here's a link to my blend file. I don't know whether it's safe. I can only post a similar question on Blender Stack Exchange and upload a file there in 1.5 hours. Download, unpack, and launch Blender from here (takes minutes). Open the file; in the script window you can press Alt+P to run it. Uncomment the functions at the end of the script. The slow function will simply freeze the application.
I think @cephner is right in saying that this is due to the implementation of the __setitem__ method: for every index it has to access the pixel, change its value, and trigger an update. The per-element loop pays that cost once per pixel, whereas the slice assignment in quick_random() hands the whole list over in a single call.
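If you want to push the fast version further, here is a hedged sketch (assuming the same image named 'img' from the question): generate the noise with NumPy and hand it over in the same single slice assignment that quick_random() uses:
import bpy
import numpy as np

def numpy_random():
    pixels = bpy.data.images['img'].pixels
    rand = np.random.rand(len(pixels))   # one vectorised call instead of a Python-level loop
    pixels[0:] = rand.tolist()           # one bulk assignment, as in quick_random()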

Python seek function returns empty string on dynamically updated data

I am using seek function to extract new lines in an updated file. My code looks like this:
import time

read_data = open('path-to-myfile', 'r')
read_data.seek(0, 2)   # jump to the end of the file
while True:
    time.sleep(sometime)
    new_data = read_data.readlines()
    # do something with new_data
myfile is a CSV file that is constantly being updated.
The problem is that, usually after several iterations of the while loop, new_data comes back empty. The number of iterations it takes varies. When I check myfile, it is still updating... So is there a problem in my code? Or is there another way to do this?
Any help appreciated!
You have two programs accessing the same file on disk? If that is the case, the resource may be locked. I set up an example script that writes to a file, and another that reads it for changes, based on the code you provided.
So in one instance of python:
import time

while True:
    time.sleep(2)
    with open('test.txt', 'a') as read_data:
        read_data.seek(0, 2)
        read_data.write("bibbity boopity\n")
And in another instance of python
import time

read_data = open('test.txt', 'r')
read_data.seek(0, 2)
while True:
    time.sleep(1)
    new_data = read_data.readlines()
    print(new_data)
In this case, the file is updated more slowly than it is read, so some of the reads printed by the second program will be empty. If I speed up the writes I still see the changes, but there are instances where not all of the updates are picked up.
You may want to use asynchronous file reading to catch all the changes. Python 3's asyncio library doesn't support async file read/write, but curio does.
See also this question

Pause Python Generator

I have a Python generator that does work producing a large amount of data, which uses up a lot of RAM. Is there a way of detecting whether the yielded data has been "consumed" by the code that is using the generator and, if not, pausing until it has been?
def multi_grab(urls, proxy=None, ref=None, xpath=False, compress=True, delay=10, pool_size=50, retries=1, http_obj=None):
    if proxy is not None:
        proxy = web.ProxyManager(proxy, delay=delay)
        pool_size = len(pool_size.records)
    work_pool = pool.Pool(pool_size)
    partial_grab = partial(grab, proxy=proxy, post=None, ref=ref, xpath=xpath, compress=compress, include_url=True, retries=retries, http_obj=http_obj)
    for result in work_pool.imap_unordered(partial_grab, urls):
        if result:
            yield result
run from:
if __name__ == '__main__':
    links = set(link for link in grab('http://www.reddit.com', xpath=True).xpath('//a/@href') if link.startswith('http') and 'reddit' not in link)
    print '%s links' % len(links)
    counter = 1
    for url, data in multi_grab(links, pool_size=10):
        print 'got', url, counter, len(data)
        counter += 1
A generator simply yields values. There's no way for the generator to know what's being done with them.
But a generator also pauses constantly: while the caller does whatever it does with the value, the generator doesn't execute again until the caller asks for the next one. It doesn't run on a separate thread or anything. It sounds like you have a misconception about how generators work. Can you show some code?
The point of a generator in Python is that it can discard extra, unneeded objects after each iteration. The only time it will keep those extra objects (and thus extra RAM) is when they are referenced somewhere else (for example, appended to a list). Make sure you aren't saving these values unnecessarily.
If you're dealing with multithreading/processing, then you probably want to implement a Queue that you could pull data from, keeping track of the number of tasks you're processing.
I think you may be looking for the yield keyword. It's explained in another Stack Overflow question: What does the "yield" keyword do in Python?
A solution could be to use a Queue to which the generator would add data, while another part of the code would get data from it and process it. This way you could ensure that there is no more than n items in memory at the same time.
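A minimal Python 3 sketch of that bounded-queue idea (produce(), expensive_result(), and the sizes are all made up): put() on a queue created with maxsize=n blocks as soon as n unconsumed items are waiting, which gives exactly the "pause until consumed" behaviour:
import queue
import threading

def expensive_result(i):
    return i * i                     # stand-in for the real, memory-hungry work

def produce(q):
    for i in range(100):
        q.put(expensive_result(i))   # blocks while the queue already holds maxsize items
    q.put(None)                      # sentinel: tell the consumer we're done

q = queue.Queue(maxsize=10)          # at most 10 unconsumed results in memory at once
threading.Thread(target=produce, args=(q,), daemon=True).start()

while True:
    item = q.get()
    if item is None:
        break
    print(item)                      # consumer: process and discard each result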
