What is taking up the memory in this for-loop? - python

I was playing around with the memory_profiler package (downloaded from pip), more specifically, looking at the memory efficiency of looping through a list by creating a temporary list first vs. looping through an "iterator list".
This was a problem that I encountered a while back and I wanted to benchmark my solution. The problem was that I needed to compare each element in a list with the next element in the same list, until all elements had been "dealt with". So I guess this would be an O(n^2) solution (if the most naive approach is picked: for each element in the list, loop through the list).
Anyways, the three functions below are all doing the same thing (more or less); looping over a list that is zipped with itself-offset-by-one.
import cProfile

@profile
def zips():
    li = range(1, 20000000)
    for tup in zip(li, li[1:]):
        pass
    del li

@profile
def izips():
    from itertools import izip
    li = range(1, 20000000)
    for tup in izip(li, li[1:]):
        pass
    del li

@profile
def izips2():
    from itertools import izip
    li = range(1, 20000000)
    for tup in izip(li, li[1:]):
        del tup
    del li

if __name__ == '__main__':
    zips()
    # izips()
    # izips2()
The surprising part (to me) was the memory usage. First I ran the zips() function, and although I thought I had cleaned up, I still ended up with ~1.5 GB in memory:
ipython -m memory_profiler python_profiling.py
Filename: python_profiling.py

Line #    Mem usage    Increment   Line Contents
================================================
    10                             @profile
    11    27.730 MB     0.000 MB   def zips():
    12   649.301 MB   621.570 MB       li = range(1,20000000)
    13  3257.605 MB  2608.305 MB       for tup in zip(li,li[1:]):
    14  1702.504 MB -1555.102 MB           pass
    15  1549.914 MB  -152.590 MB       del li
Then I close the interpreter instance and reopen it for running the next test, which is the izips() function:
ipython -m memory_profiler python_profiling.py
Filename: python_profiling.py

Line #    Mem usage    Increment   Line Contents
================================================
    17                             @profile
    18    27.449 MB     0.000 MB   def izips():
    19    27.449 MB     0.000 MB       from itertools import izip
    20   649.051 MB   621.602 MB       li = range(1,20000000)
    21  1899.512 MB  1250.461 MB       for tup in izip(li,li[1:]):
    22  1746.922 MB  -152.590 MB           pass
    23  1594.332 MB  -152.590 MB       del li
And then finally I ran a test (again after restarting the interpreter in between) where I tried to explicitly delete the tuple inside the for loop, to make sure its memory would be freed (maybe I'm not thinking about that correctly?). It turns out that didn't make a difference, so I'm guessing that either I'm not triggering GC or that this is not the source of my memory overhead.
ipython -m memory_profiler python_profiling.py
Filename: python_profiling.py

Line #    Mem usage    Increment   Line Contents
================================================
    25                             @profile
    26    20.109 MB     0.000 MB   def izips2():
    27    20.109 MB     0.000 MB       from itertools import izip
    28   641.676 MB   621.566 MB       li = range(1,20000000)
    29  1816.953 MB  1175.277 MB       for tup in izip(li,li[1:]):
    30  1664.387 MB  -152.566 MB           del tup
    31  1511.797 MB  -152.590 MB       del li
Bottom line:
I thought that the overhead of the for loop itself was minimal, and therefore I was expecting just a little more than ~620 MB (the memory it takes to store the list), but instead it looks like I have ~2 lists of 20,000,000 elements in memory, plus even more overhead. Can anyone help me explain what all this memory is being used for? (And what is taking up that ~1.5 GB at the end of each run?)

Note that the OS assigns memory in chunks, and doesn't necessarily reclaim it all in one go. I've found the memory profiling package to be wildly inaccurate because it appears it fails to take that into account.
Your li[1:] slice creates a new list with (2*10**7) - 1 elements, nearly a whole new copy, easily doubling the memory required for the lists. On Python 2 the zip() call also returns a full new list object holding the output of the zipping, nearly 20 million 2-element tuples, which again requires memory for the intermediate result.
You could use a new iterator instead of slicing:
def zips():
    from itertools import izip
    li = range(1, 20000000)
    next_li = iter(li)
    next(next_li)  # advance one step
    for tup in izip(li, next_li):
        pass
    del li
The list iterator returned by the iter() call is much more lightweight; it only keeps a reference to the original list and a position counter. Combining this with izip() avoids creating the output list as well.
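For comparison, on Python 3 this pattern needs no izip at all: zip() is already lazy, and itertools.pairwise() (available from Python 3.10) does the offset-by-one pairing directly. A minimal sketch, assuming Python 3.10+:

# Python 3 sketch, assuming Python 3.10+ for itertools.pairwise
from itertools import pairwise

li = range(1, 20000000)        # a range object, never materialised as a list
for tup in pairwise(li):       # lazily yields (1, 2), (2, 3), ...
    pass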

Related

why do these recursive functions use the same amount of memory? [duplicate]

I was trying to replicate the memory usage test here.
Essentially, the post claims that given the following code snippet:
import copy
import memory_profiler

@profile
def function():
    x = list(range(1000000))  # allocate a big list
    y = copy.deepcopy(x)
    del x
    return y

if __name__ == "__main__":
    function()
Invoking
python -m memory_profiler memory-profile-me.py
prints, on a 64-bit computer
Filename: memory-profile-me.py

Line #    Mem usage    Increment   Line Contents
================================================
     4                             @profile
     5      9.11 MB      0.00 MB   def function():
     6     40.05 MB     30.94 MB       x = list(range(1000000)) # allocate a big list
     7     89.73 MB     49.68 MB       y = copy.deepcopy(x)
     8     82.10 MB     -7.63 MB       del x
     9     82.10 MB      0.00 MB       return y
I copied and pasted the same code but my profiler yields
Line #    Mem usage     Increment   Line Contents
================================================
     3    44.711 MiB   44.711 MiB   @profile
     4                              def function():
     5    83.309 MiB   38.598 MiB       x = list(range(1000000)) # allocate a big list
     6    90.793 MiB    7.484 MiB       y = copy.deepcopy(x)
     7    90.793 MiB    0.000 MiB       del x
     8    90.793 MiB    0.000 MiB       return y
This post could be outdated: either the profiler package or Python could have changed. In any case, my questions are, for Python 3.6.x:
(1) Should copy.deepcopy(x) (as defined in the code above) consume a nontrivial amount of memory?
(2) Why couldn't I replicate it?
(3) If I repeat x = list(range(1000000)) after del x, would the memory increase by the same amount as when I first assigned x = list(range(1000000)) (as in line 5 of my code)?
copy.deepcopy() recursively copies mutable objects only; immutable objects such as integers or strings are not copied. The list being copied consists of immutable integers, so the y copy ends up sharing references to the same integer values:
>>> import copy
>>> x = list(range(1000000))
>>> y = copy.deepcopy(x)
>>> x[-1] is y[-1]
True
>>> all(xv is yv for xv, yv in zip(x, y))
True
So the copy only needs to create a new list object with 1 million references, an object that takes a little over 8MB of memory on my Python 3.6 build on Mac OS X 10.13 (a 64-bit OS):
>>> import sys
>>> sys.getsizeof(y)
8697464
>>> sys.getsizeof(y) / 2 ** 20 # MiB
8.294548034667969
An empty list object takes 64 bytes, each reference takes 8 bytes:
>>> sys.getsizeof([])
64
>>> sys.getsizeof([None])
72
Python list objects overallocate space so they can grow; converting a range() object to a list makes it reserve a little more room for additional growth than deepcopy does, so x is slightly larger still, with room for about another 125k references before it has to resize again:
>>> sys.getsizeof(x)
9000112
>>> sys.getsizeof(x) / 2 ** 20
8.583175659179688
>>> ((sys.getsizeof(x) - 64) // 8) - 10**6
125006
while the copy only has spare room left for about 87k:
>>> ((sys.getsizeof(y) - 64) // 8) - 10**6
87175
On Python 3.6 I can't replicate the article's claims either, in part because Python has seen a lot of memory management improvements, and in part because the article is wrong on several points.
The behaviour of copy.deepcopy() regarding lists and integers has never changed in the long history of the module (see the first revision, added in 1995), and the interpretation of the memory figures is wrong, even on Python 2.7.
Specifically, I can reproduce the results using Python 2.7. This is what I see on my machine:
$ python -V
Python 2.7.15
$ python -m memory_profiler memtest.py
Filename: memtest.py

Line #    Mem usage     Increment   Line Contents
================================================
     4    28.406 MiB   28.406 MiB   @profile
     5                              def function():
     6    67.121 MiB   38.715 MiB       x = list(range(1000000)) # allocate a big list
     7   159.918 MiB   92.797 MiB       y = copy.deepcopy(x)
     8   159.918 MiB    0.000 MiB       del x
     9   159.918 MiB    0.000 MiB       return y
What is happening is that Python's memory management system is allocating a new chunk of memory for additional expansion. It's not that the new y list object takes nearly 93 MiB of memory; that's just the additional memory the OS has allocated to the Python process when that process requested some more memory for the object heap. The list object itself is a lot smaller.
The Python 3 tracemalloc module is a lot more accurate about what actually happens:
python3 -m memory_profiler --backend tracemalloc memtest.py
Filename: memtest.py

Line #    Mem usage     Increment   Line Contents
================================================
     4     0.001 MiB    0.001 MiB   @profile
     5                              def function():
     6    35.280 MiB   35.279 MiB       x = list(range(1000000)) # allocate a big list
     7    35.281 MiB    0.001 MiB       y = copy.deepcopy(x)
     8    26.698 MiB   -8.583 MiB       del x
     9    26.698 MiB    0.000 MiB       return y
The Python 3.x memory manager and list implementation are smarter than the ones in 2.7; evidently the new list object was able to fit into existing, already-available memory that had been pre-allocated when creating x.
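For completeness, tracemalloc can also be used directly, without going through memory_profiler. A minimal sketch (Python 3 only; it reports only what the Python allocator attributes to objects, not what the OS reserves for the process, so the figures will differ from the table above):

import copy
import tracemalloc

tracemalloc.start()
x = list(range(1000000))
before = tracemalloc.get_traced_memory()[0]   # current traced size, in bytes
y = copy.deepcopy(x)
after = tracemalloc.get_traced_memory()[0]
print("deepcopy accounted for %.3f MiB" % ((after - before) / 2 ** 20))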
We can test Python 2.7's behaviour with a manually built Python 2.7.12 tracemalloc binary and a small patch to memory_profiler.py. Now we get more reassuring results on Python 2.7 as well:
Filename: memtest.py

Line #    Mem usage     Increment   Line Contents
================================================
     4     0.099 MiB    0.099 MiB   @profile
     5                              def function():
     6    31.734 MiB   31.635 MiB       x = list(range(1000000)) # allocate a big list
     7    31.726 MiB   -0.008 MiB       y = copy.deepcopy(x)
     8    23.143 MiB   -8.583 MiB       del x
     9    23.141 MiB   -0.002 MiB       return y
I note that the author was confused as well:
copy.deepcopy copies both lists, which allocates again ~50 MB (I am not sure where the additional overhead of 50 MB - 31 MB = 19 MB comes from)
(Bold emphasis mine).
The error here is to assume that all memory changes in the Python process size can be attributed directly to specific objects, but the reality is far more complex, as the memory manager can add (and remove!) memory 'arenas', blocks of memory reserved for the heap, as needed, and will do so in larger blocks if that makes sense. The process is complex because it depends on interactions between Python's memory manager and the OS malloc implementation details. The author has found an older article on Python's memory model that they misunderstood to be current; the author of that article has already tried to point this out: as of Python 2.5, the claim that Python doesn't free memory is no longer true.
What's troubling is that the same misunderstandings then lead the author to recommend against using pickle, but in reality the module, even on Python 2, never adds more than a little bookkeeping memory to track recursive structures. See this gist for my testing methodology; using cPickle on Python 2.7 adds a one-time 46 MiB increase (doubling the create_file() call results in no further memory increase), and in Python 3 the memory changes are gone altogether.
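The linked gist isn't reproduced here, but a rough sketch of that kind of check could look like the following (Python 2; the create_file() name, the temporary file path, and the psutil.Process.memory_info() call are stand-ins for this sketch, not the gist's actual code):

import gc
import os
import cPickle
import psutil

def rss_mib():
    # resident set size of this process, in MiB (assumes a psutil version with memory_info())
    return psutil.Process(os.getpid()).memory_info().rss / 2.0 ** 20

def create_file(path='/tmp/testpickle.bin'):
    data = list(range(1000000))
    with open(path, 'wb') as fh:
        cPickle.dump(data, fh, cPickle.HIGHEST_PROTOCOL)

print 'before: %.1f MiB' % rss_mib()
create_file()
gc.collect()
print 'after first dump: %.1f MiB' % rss_mib()
create_file()          # a second dump should not grow the process further
gc.collect()
print 'after second dump: %.1f MiB' % rss_mib()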
I'll open a dialog with the Theano team about the post; the article is wrong and confusing, and Python 2.7 is soon to be made entirely obsolete anyway, so they really should focus on Python 3's memory model. (*)
When you create a new list from range(), not a copy, you'll see a similar increase in memory as for creating x the first time, because you'd create a new set of integer objects in addition to the new list object. Aside from a specific set of small integers, Python doesn't cache and re-use integer values for range() operations.
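A quick way to see this sharing behaviour (a CPython implementation detail; only the small-integer cache of roughly -5 to 256 is shared across objects):

x = list(range(1000000))
z = list(range(1000000))
print(x[5] is z[5])      # True  -- small ints come from the shared cache
print(x[-1] is z[-1])    # False -- 999999 is a fresh object in each list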
(*) Addendum: I opened issue #6619 with the Theano project. The project agreed with my assessment and removed the page from their documentation, although they haven't yet updated the published version.

Python Memory Leak Using binascii, zlib, struct, and numpy

I have a python script which is processing a large amount of data from compressed ASCII. After a short period, it runs out of memory. I am not constructing large lists or dicts. The following code illustrates the issue:
import struct
import zlib
import binascii
import numpy as np
import psutil
import os
import gc

process = psutil.Process(os.getpid())
n = 1000000

compressed_data = binascii.b2a_base64(bytearray(zlib.compress(struct.pack('%dB' % n, *np.random.random(n))))).rstrip()

print 'Memory before entering the loop is %d MB' % (process.get_memory_info()[0] / float(2 ** 20))

for i in xrange(2):
    print 'Memory before iteration %d is %d MB' % (i, process.get_memory_info()[0] / float(2 ** 20))
    byte_array = zlib.decompress(binascii.a2b_base64(compressed_data))
    a = np.array(struct.unpack('%dB' % (len(byte_array)), byte_array))
    gc.collect()

gc.collect()
print 'Memory after last iteration is %d MB' % (process.get_memory_info()[0] / float(2 ** 20))
It prints:
Memory before entering the loop is 45 MB
Memory before iteration 0 is 45 MB
Memory before iteration 1 is 51 MB
Memory after last iteration is 51 MB
Between the first and second iteration, 6 MB of memory gets allocated. If I run the loop more than two times, the memory usage stays at 51 MB. If I put the decompression code into its own function and feed it the actual compressed data, the memory usage will continue to grow. I am using Python 2.7. Why is the memory increasing and how can it be corrected? Thank you.
Through comments, we figured out what was going on:
The main issue is that variables assigned in a for loop are not destroyed once the loop ends. They remain accessible, still pointing to the value they received in the last iteration:
>>> for i in range(5):
...     a = i
...
>>> print a
4
So here's what's happening:
First iteration: the print shows 45 MB, which is the memory before instantiating byte_array and a.
The code then instantiates those two large objects, taking the memory up to 51 MB.
Second iteration: the two objects instantiated in the first run of the loop are still there.
In the middle of the second iteration, byte_array and a are overwritten by the new instantiations. The initial ones are destroyed, but replaced by equally large objects.
The for loop ends, but byte_array and a are still accessible in the code, and are therefore not freed by the second gc.collect() call.
Changing the code to:
for i in xrange(2):
    [ . . . ]
    byte_array = None
    a = None
    gc.collect()
made the memory reserved by byte_array and a unreachable, and therefore freed.
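A roughly equivalent cleanup, sketched here against the question's setup (same imports and compressed_data as above), is to del the names once the loop is done, so the last references disappear before calling gc.collect():

for i in xrange(2):
    byte_array = zlib.decompress(binascii.a2b_base64(compressed_data))
    a = np.array(struct.unpack('%dB' % len(byte_array), byte_array))

del byte_array, a   # the loop variables no longer hold the large objects
gc.collect()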
There's more on Python's garbage collection in this SO answer: https://stackoverflow.com/a/4484312/289011
Also, it may be worth looking at How do I determine the size of an object in Python?. This is tricky, though... if your object is a list pointing to other objects, what is the size? The sum of the pointers in the list? The sum of the size of the objects those pointers point to?
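If you do want a number, a rough recursive estimator (just a sketch; it only handles the common container types and ignores things like __slots__ or numpy buffers) could look like this:

import sys

def total_size(obj, seen=None):
    # Recursively sum sys.getsizeof() over containers, counting shared objects only once.
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, seen) + total_size(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_size(item, seen) for item in obj)
    return size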

Interpreting the output of python memory_profiler

Please excuse this naive question of mine. I am trying to monitor the memory usage of my Python code, and I have come across the promising memory_profiler package. I have a question about interpreting the output generated by the @profile decorator.
Here is a sample output that I get by running my dummy code below:
dummy.py
from memory_profiler import profile

@profile
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a

if __name__ == '__main__':
    my_func()
Calling dummy.py by "python dummy.py" returns the table below.
Line #    Mem usage    Increment   Line Contents
================================================
     3     8.2 MiB      0.0 MiB    @profile
     4                             def my_func():
     5    15.8 MiB      7.6 MiB        a = [1] * (10 ** 6)
     6   168.4 MiB    152.6 MiB        b = [2] * (2 * 10 ** 7)
     7    15.8 MiB   -152.6 MiB        del b
     8    15.8 MiB      0.0 MiB        return a
My question is: what does the 8.2 MiB in the first line of the table correspond to? My guess is that it is the initial memory usage of the Python interpreter itself, but I am not sure. If that is the case, is there a way to have this baseline usage automatically subtracted from the memory usage of the script?
Many thanks for your time and consideration!
Noushin
According to the docs:
The first column represents the line number of the code that has been profiled, the second column (Mem usage) the memory usage of the Python interpreter after that line has been executed. The third column (Increment) represents the difference in memory of the current line with respect to the last one.
So, that 8.2 MiB is the memory usage after the first line has been executed. That includes the memory needed to start up Python, load your script and all of its imports (including memory_profiler itself), and so on.
There don't appear to be any documented options for removing that from each entry. But it wouldn't be too hard to post-process the results.
Alternatively, do you really need to do that? The third column shows how much additional memory has been used after each line, and either that, or the sum of that across a range of lines, seems more interesting than the difference between each line's second column and the start.
The per-line difference in memory is given in the Increment column; alternatively, you could write a small script to post-process the output.
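As a sketch of that post-processing (assuming you have saved the profiler table to a file, here called profile_output.txt, and that your memory_profiler version prints MiB values), subtract the first reported Mem usage figure from every later line:

import re

baseline = None
with open('profile_output.txt') as fh:
    for line in fh:
        m = re.match(r'\s*(\d+)\s+([\d.]+) MiB\s+([-\d.]+) MiB(.*)', line)
        if not m:
            print(line.rstrip())          # header and source-only lines pass through
            continue
        lineno, mem, inc, rest = m.groups()
        if baseline is None:
            baseline = float(mem)
        print('%6s %12.3f MiB %10s MiB%s'
              % (lineno, float(mem) - baseline, inc, rest))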

Memory for python.exe on Windows 7 python 32 - Numpy uses half of the available memory only?

I'm loading large h5 files into memory using numpy ndarrays. I read that my system (Win 7 Professional, 6 GB RAM) should allow python.exe to use about 2 GB of physical memory.
However, I'm getting a MemoryError already just shy of 1 GB. Even stranger, this lower limit seems to apply only to numpy arrays, not to lists.
I've tested my memory consumption using the following function found here:
import psutil
import gc
import os
import numpy as np
from matplotlib.pyplot import pause

def memory_usage_psutil():
    # return the memory usage in MB
    process = psutil.Process(os.getpid())
    mem = process.get_memory_info()[0]/float(2**20)
    return mem
Test 1: Testing memory limits for an ordinary list
print 'Memory - %d MB' %memory_usage_psutil() # prints memory usage after imports
a = []
while 1:
    try:
        a.append([x*2000 for x in xrange(10000)])
    except MemoryError:
        print 'Memory - %d MB' %memory_usage_psutil()
        a = []
        print 'Memory - %d MB' %memory_usage_psutil()
        print 'run garbage collector: collected %d objects.' %gc.collect()
        print 'Memory - %d MB\n\n' %memory_usage_psutil()
        break
Test 1 prints:
Memory - 39 MB
Memory - 1947 MB
Memory - 1516 MB
run garbage collector: collected 0 objects.
Memory - 49 MB
Test 2: Creating a number of large np.array's
shape = (5500,5500)
names = ['b', 'c', 'd', 'g', 'h']
try:
    for n in names:
        globals()[n] = np.ones(shape, dtype='float64')
        print 'created variable %s with %0.2f MB'\
              %(n,(globals()[n].nbytes/2.**20))
except MemoryError:
    print 'MemoryError, Memory - %d MB. Deleting files..'\
          %memory_usage_psutil()
    pause(2)
    # Just added the pause here to be able to observe
    # the spike of memory in the Windows task manager.
    for n in names:
        globals()[n] = []
    print 'Memory - %d MB' %memory_usage_psutil()
    print 'run garbage collector: collected %d objects.' %gc.collect()
    print 'Memory - %d MB' %memory_usage_psutil()
Test 2 prints:
Memory - 39 MB
created variable b with 230.79 MB
created variable c with 230.79 MB
created variable d with 230.79 MB
created variable g with 230.79 MB
MemoryError, Memory - 964 MB. Deleting files..
Memory - 39 MB
run garbage collector: collected 0 objects.
Memory - 39 MB
My question: why do I get a MemoryError before I'm even close to the 2 GB limit, and why is there a difference in memory limits between a list and an np.array? What am I missing?
I'm using python 2.7 and numpy 1.7.1
This is probably happening because a numpy array stores its data in a single C array (for speed), so allocating one requires a single contiguous block of address space, and the underlying malloc fails when no sufficiently large contiguous block is left. A Python list, by contrast, is an array of pointers: the pointer array itself is comparatively small, and the objects it refers to can live anywhere in memory. So even if enough memory is available in total, a fragmented address space can make a large numpy allocation fail while a list can keep growing by using the non-contiguous pieces.
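For scale, each of those float64 arrays needs a single contiguous block of roughly 230 MiB, which a quick back-of-the-envelope check confirms:

# One 5500 x 5500 float64 array: rows * cols * 8 bytes, all in one contiguous block
shape = (5500, 5500)
print(shape[0] * shape[1] * 8 / 2.0 ** 20)   # ~230.79 (MiB per array)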

how can i release memory caused by "for"

Our game program initializes the data of all players into memory. My goal is to reduce memory that isn't necessary. I traced the program and found that the "for" loop is taking a lot of memory.
For example:
Line #    Mem usage    Increment   Line Contents
================================================
    52                             @profile
    53    11.691 MB     0.000 MB   def test():
    54    19.336 MB     7.645 MB       a = ["1"] * (10 ** 6)
    55    19.359 MB     0.023 MB       print recipe.total_size(a, verbose=False)
    56    82.016 MB    62.656 MB       for i in a:
    57                                     pass
print recipe.total_size(a, verbose=False) reports 8000098 bytes.
The question is: how can I release that 62.656 MB of memory?
P.S.
Sorry, I know my English is not very good. I appreciate everyone taking the time to read this. :-)
If you are absolutely desperate to reduce memory usage on the loop you can do it this way:
i = 0
while 1:
    try:
        a[i]  # accessing an element here
        i += 1
    except IndexError:
        break
Memory stats (if they are accurate):
    12     9.215 MB     0.000 MB   i = 0
    13     9.215 MB     0.000 MB   while 1:
    14    60.484 MB    51.270 MB       try:
    15    60.484 MB     0.000 MB           a[i]
    16    60.484 MB     0.000 MB           i += 1
    17    60.484 MB     0.000 MB       except IndexError:
    18    60.484 MB     0.000 MB           break
However, this code is ugly and error-prone, and the reduction in memory usage is tiny.
1) Instead of a list, you should use a generator. Based on your sample code:
@profile
def test():
    a = ("1" for i in range(10**6))  # this returns a generator instead of a list
    for i in a:
        pass
Now if you use the generator 'a' in the for loop, it won't take that much memory.
2) If you are given a list, first convert it into a generator.
@profile
def test():
    a = ["1"] * (10**6)  # the existing list
    g = (i for i in a)   # wrap the list in a generator expression
    for i in g:          # iterate over the generator object
        pass
Try this and see if it helps.
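As a further alternative to the generator expressions above (a sketch, not from the original answer): if the values really are all the same string, itertools.repeat yields that object n times without ever building a million-element list.

from itertools import repeat
from memory_profiler import profile

@profile
def test():
    for i in repeat("1", 10 ** 6):
        pass

if __name__ == '__main__':
    test()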
