I know that pdb is an interactive system and it is very helpful.
My ultimate goal is to gather the memory state after executing each command in a certain function, command by command. For example, given the code snippet
0: def foo():
1:     if True:
2:         x = 1
3:     else:
4:         x = 2
5:     x
then the memory state of each command is
0: empty
1: empty
2: x = 1
3: x = 1
4: (not taken)
5: x = 1
To do this, I'd like to write a script that interacts with the pdb module. I know that s steps forward one statement and that print var (in the above case, var is x) prints the value of a variable, so I could gather the variables at each command. I want to run a script like the one below:
import pdb
pdb.run('foo()')
while(not pdb.end()):
    pdb.s()
    pdb.print('x')
But I cannot find any way to implement this functionality. Can anybody help me?
Try memory_profiler:
The line-by-line memory usage mode is used much in the same way of the
line_profiler: first decorate the function you would like to profile
with @profile and then run the script with a special script (in this
case with specific arguments to the Python interpreter).
Line #    Mem usage    Increment   Line Contents
================================================
     3                             @profile
     4      5.97 MB      0.00 MB   def my_func():
     5     13.61 MB      7.64 MB       a = [1] * (10 ** 6)
     6    166.20 MB    152.59 MB       b = [2] * (2 * 10 ** 7)
     7     13.61 MB   -152.59 MB       del b
     8     13.61 MB      0.00 MB       return a
Or Heapy:
The aim of Heapy is to support debugging and optimization regarding
memory related issues in Python programs.
Partition of a set of 132527 objects. Total size = 8301532 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 35144 27 2140412 26 2140412 26 str
1 38397 29 1309020 16 3449432 42 tuple
2 530 0 739856 9 4189288 50 dict (no owner)
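If the goal is the values of the local variables at each step rather than the number of bytes in use, one possible approach is sys.settrace instead of scripting pdb. The following is only a sketch: trace_locals is a hypothetical helper name, not part of pdb or any library. Note that a "line" event fires just before a line runs, so each snapshot shows the state left by the previous line, and the final "return" event shows the state at the end. CPython may also optimise away a constant test such as "if True:", in which case no event is reported for that line.

```python
import sys


def trace_locals(func, *args, **kwargs):
    """Run func and record (line offset, event, locals) for each step.

    Hypothetical helper for illustration only.
    """
    snapshots = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Ignore frames that do not belong to the function under study.
        if frame.f_code is not code:
            return None
        if event in ("line", "return"):
            offset = frame.f_lineno - code.co_firstlineno
            # Copy the locals so later changes do not alter this snapshot.
            snapshots.append((offset, event, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)
    return result, snapshots


def foo():
    if True:
        x = 1
    else:
        x = 2
    return x


result, snapshots = trace_locals(foo)
for offset, event, local_vars in snapshots:
    print(offset, event, local_vars)
```

The offsets are relative to the def line, which matches the numbering used in the question.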
I was trying to replicate the memory usage test here.
Essentially, the post claims that given the following code snippet:
import copy
import memory_profiler
@profile
def function():
    x = list(range(1000000)) # allocate a big list
    y = copy.deepcopy(x)
    del x
    return y
if __name__ == "__main__":
    function()
Invoking
python -m memory_profiler memory-profile-me.py
prints, on a 64-bit computer
Filename: memory-profile-me.py
Line #    Mem usage    Increment   Line Contents
================================================
     4                             @profile
     5      9.11 MB      0.00 MB   def function():
     6     40.05 MB     30.94 MB       x = list(range(1000000)) # allocate a big list
     7     89.73 MB     49.68 MB       y = copy.deepcopy(x)
     8     82.10 MB     -7.63 MB       del x
     9     82.10 MB      0.00 MB       return y
I copied and pasted the same code but my profiler yields
Line #    Mem usage    Increment   Line Contents
================================================
     3   44.711 MiB   44.711 MiB   @profile
     4                             def function():
     5   83.309 MiB   38.598 MiB       x = list(range(1000000)) # allocate a big list
     6   90.793 MiB    7.484 MiB       y = copy.deepcopy(x)
     7   90.793 MiB    0.000 MiB       del x
     8   90.793 MiB    0.000 MiB       return y
This post could be outdated --- either the profiler package or python could have changed. In any case, my questions are, in Python 3.6.x
(1) Should copy.deepcopy(x) (as defined in the code above) consume a nontrivial amount of memory?
(2) Why couldn't I replicate?
(3) If I repeat x = list(range(1000000)) after del x, would the memory increase by the same amount as I first assigned x = list(range(1000000)) (as in line 5 of my code)?
copy.deepcopy() recursively copies only mutable objects; immutable objects such as integers or strings are not copied. The list being copied consists of immutable integers, so the copy y ends up sharing references to the same integer objects:
>>> import copy
>>> x = list(range(1000000))
>>> y = copy.deepcopy(x)
>>> x[-1] is y[-1]
True
>>> all(xv is yv for xv, yv in zip(x, y))
True
So the copy only needs to create a new list object with 1 million references, an object that takes a little over 8MB of memory on my Python 3.6 build on Mac OS X 10.13 (a 64-bit OS):
>>> import sys
>>> sys.getsizeof(y)
8697464
>>> sys.getsizeof(y) / 2 ** 20 # MiB
8.294548034667969
An empty list object takes 64 bytes, each reference takes 8 bytes:
>>> sys.getsizeof([])
64
>>> sys.getsizeof([None])
72
Python list objects overallocate space so they can grow. Converting a range() object to a list leaves a little more room for additional growth than deepcopy does, so x is slightly larger still, with room for an additional 125k objects before it has to resize again:
>>> sys.getsizeof(x)
9000112
>>> sys.getsizeof(x) / 2 ** 20
8.583175659179688
>>> ((sys.getsizeof(x) - 64) // 8) - 10**6
125006
while the copy only has additional space left for about 87k:
>>> ((sys.getsizeof(y) - 64) // 8) - 10**6
87175
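As a contrast to the integer case, here is a small sketch of what deepcopy does when the elements are themselves mutable. The variable names are only illustrative. With inner lists, every element has to be copied as well, so the copy costs far more than one new outer list.

```python
import copy

# A list of immutable ints: the copy is a new list sharing the same int objects.
ints = list(range(1000, 1010))
ints_copy = copy.deepcopy(ints)
ints_shared = all(a is b for a, b in zip(ints, ints_copy))

# A list of mutable inner lists: every inner list is copied too.
nested = [[i] for i in range(10)]
nested_copy = copy.deepcopy(nested)
inner_shared = any(a is b for a, b in zip(nested, nested_copy))

print(ints_copy is ints, ints_shared)      # new outer list, shared elements
print(nested_copy == nested, inner_shared)  # equal contents, no shared inner lists
```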
On Python 3.6 I can't replicate the article claims either, in part because Python has seen a lot of memory management improvements, and in part because the article is wrong on several points.
The behaviour of copy.deepcopy() regarding lists and integers has never changed in the long history of the module (see the first revision of the module, added in 1995), and the interpretation of the memory figures is wrong, even on Python 2.7.
Specifically, I can reproduce the results using Python 2.7. This is what I see on my machine:
$ python -V
Python 2.7.15
$ python -m memory_profiler memtest.py
Filename: memtest.py
Line #    Mem usage    Increment   Line Contents
================================================
     4   28.406 MiB   28.406 MiB   @profile
     5                             def function():
     6   67.121 MiB   38.715 MiB       x = list(range(1000000)) # allocate a big list
     7  159.918 MiB   92.797 MiB       y = copy.deepcopy(x)
     8  159.918 MiB    0.000 MiB       del x
     9  159.918 MiB    0.000 MiB       return y
What is happening is that Python's memory management system is allocating a new chunk of memory for additional expansion. It's not that the new y list object takes nearly 93MiB of memory; that's just the additional memory the OS has allocated to the Python process when that process requested some more memory for the object heap. The list object itself is a lot smaller.
The Python 3 tracemalloc module is a lot more accurate about what actually happens:
python3 -m memory_profiler --backend tracemalloc memtest.py
Filename: memtest.py
Line #    Mem usage    Increment   Line Contents
================================================
     4    0.001 MiB    0.001 MiB   @profile
     5                             def function():
     6   35.280 MiB   35.279 MiB       x = list(range(1000000)) # allocate a big list
     7   35.281 MiB    0.001 MiB       y = copy.deepcopy(x)
     8   26.698 MiB   -8.583 MiB       del x
     9   26.698 MiB    0.000 MiB       return y
The Python 3.x memory manager and list implementation are smarter than the ones in 2.7; evidently the new list object was able to fit into existing already-available memory, pre-allocated when creating x.
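The tracemalloc module can also be used directly, without memory_profiler, to see how much memory Python itself allocated for a single operation. The sketch below assumes a 64-bit CPython 3 build; traced_delta is a hypothetical helper name. It reports the bytes that are still held after the call returns, which for deepcopy of a list of integers should be roughly the size of one new list object.

```python
import copy
import tracemalloc


def traced_delta(func):
    """Return (result, net bytes allocated by Python while running func).

    Hypothetical helper for illustration only.
    """
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    result = func()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, after - before


x = list(range(1000000))  # allocated before tracing starts, so not counted
y, delta = traced_delta(lambda: copy.deepcopy(x))
print("deepcopy held on to %.2f MiB" % (delta / 2 ** 20))
```

Because tracemalloc counts allocations made through Python's allocators rather than the size of the whole process, the figure is not affected by the heap arenas the OS hands out.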
We can test Python 2.7's behaviour with a manually built Python 2.7.12 tracemalloc binary and a small patch to memory_profiler.py. Now we get more reassuring results on Python 2.7 as well:
Filename: memtest.py
Line #    Mem usage    Increment   Line Contents
================================================
     4    0.099 MiB    0.099 MiB   @profile
     5                             def function():
     6   31.734 MiB   31.635 MiB       x = list(range(1000000)) # allocate a big list
     7   31.726 MiB   -0.008 MiB       y = copy.deepcopy(x)
     8   23.143 MiB   -8.583 MiB       del x
     9   23.141 MiB   -0.002 MiB       return y
I note that the author was confused as well:
copy.deepcopy copies both lists, which allocates again ~50 MB (I am not sure where the additional overhead of 50 MB - 31 MB = 19 MB comes from)
(Bold emphasis mine).
The error here is to assume that all changes in the Python process size can be attributed directly to specific objects. The reality is far more complex: the memory manager can add (and remove!) memory 'arenas', blocks of memory reserved for the heap, as needed, and will do so in larger blocks if that makes sense. The process depends on interactions between Python's manager and the OS malloc implementation details. The author has found an older article on Python's model that they have misunderstood to be current; the author of that article has already tried to point this out. As of Python 2.5, the claim that Python doesn't free memory is no longer true.
What's troubling is that the same misunderstandings then lead the author to recommend against using pickle, but in reality the module, even on Python 2, never adds more than a little bookkeeping memory to track recursive structures. See this gist for my testing methodology; using cPickle on Python 2.7 adds a one-time 46MiB increase (doubling the create_file() call results in no further memory increase). In Python 3, the memory changes are gone altogether.
I'll open a dialog with the Theano team about the post; the article is wrong and confusing, and Python 2.7 is soon to be made entirely obsolete anyway, so they really should focus on Python 3's memory model. (*)
When you create a new list from range(), not a copy, you'll see a similar increase in memory as for creating x the first time, because you'd create a new set of integer objects in addition to the new list object. Aside from a specific set of small integers, Python doesn't cache and re-use integer values for range() operations.
(*) addendum: I opened issue #6619 with the Theano project. The project agreed with my assessment and removed the page from their documentation, although they haven't yet updated the published version.
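The point about rebuilding the list from range() can be checked with a quick identity test. This is a sketch that relies on CPython behaviour (the small-integer cache is an implementation detail, not a language guarantee), and the variable names are only illustrative.

```python
# Two lists built from the same range of large values: each call to
# list(range(...)) creates fresh int objects on CPython.
first = list(range(1000, 1010))
second = list(range(1000, 1010))
large_shared = any(a is b for a, b in zip(first, second))

# Small integers are cached by CPython, so both lists reference the same objects.
small_first = list(range(10))
small_second = list(range(10))
small_shared = all(a is b for a, b in zip(small_first, small_second))

print(large_shared, small_shared)
```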
I am trying to understand how to use memory_profiler on my Python app.
Referring to the Python memory profile guide, I copied the following code snippet:
from memory_profiler import profile
@profile
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a
The expected result according to the link is :-
Line #    Mem usage    Increment   Line Contents
================================================
     3                             @profile
     4      5.97 MB      0.00 MB   def my_func():
     5     13.61 MB      7.64 MB       a = [1] * (10 ** 6)
     6    166.20 MB    152.59 MB       b = [2] * (2 * 10 ** 7)
     7     13.61 MB   -152.59 MB       del b
     8     13.61 MB      0.00 MB       return a
But when I ran it on my VM running Ubuntu 16.04, I got the following results instead:
Line #    Mem usage    Increment   Line Contents
================================================
     3     35.4 MiB     35.4 MiB   @profile
     4                             def my_func():
     5     43.0 MiB      7.7 MiB       a = [1] * (10 ** 6)
     6    195.7 MiB    152.6 MiB       b = [2] * (2 * 10 ** 7)
     7     43.1 MiB   -152.5 MiB       del b
     8     43.1 MiB      0.0 MiB       return a
There seems to be a large overhead of around 30 MiB between the expected result and my run. I am trying to understand where it comes from and whether I am doing anything incorrectly. Should I be worried about it?
Please advise if anyone has any idea. Thanks
EDIT:
O/S : Ubuntu 16.04.4 (Xenial) running inside a VM
Python : Python 3.6.4 :: Anaconda, Inc.
The memory taken by a list or an integer depends heavily on the Python version/build.
For instance, in Python 3, all integers are long integers, whereas in Python 2, long is used only when the integer doesn't fit in a CPU register / C int.
On my machine, python 2:
>>> sys.getsizeof(2)
24
Python 3.6.2:
>>> sys.getsizeof(2)
28
When computing your ratios vs the 24/28 ratio, it's pretty close:
>>> 195.7/166.2
1.177496991576414
>>> 28/24
1.1666666666666667
(this is probably not the only difference, but that's the most obvious I can think of)
So no, as long as the results are proportional, you shouldn't worry, but if you have memory issues with python integers (python 3, that is), you could use alternatives, like numpy or other native integer types.
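To illustrate the last suggestion without installing numpy, here is a sketch using the standard-library array module as the "native integer type". The element count n is arbitrary, and the sizes printed depend on the Python build. A list stores a pointer per element plus a separate int object for each value, while array("q") stores raw 64-bit integers.

```python
import sys
from array import array

n = 100000
as_list = list(range(n))
as_array = array("q", range(n))  # "q" = signed 64-bit integers

# A list's own size covers only the pointers; the int objects are counted separately.
list_total = sys.getsizeof(as_list) + sum(sys.getsizeof(v) for v in as_list)
array_total = sys.getsizeof(as_array)

print("list + ints: %d bytes" % list_total)
print("array('q'):  %d bytes" % array_total)
```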
Please excuse this naive question of mine. I am trying to monitor memory usage of my python code, and have come across the promising memory_profiler package. I have a question about interpreting the output generated by @profile decorator.
Here is a sample output that I get by running my dummy code below:
dummy.py
from memory_profiler import profile
@profile
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a
if __name__ == '__main__':
    my_func()
Calling dummy.py by "python dummy.py" returns the table below.
Line #    Mem usage    Increment   Line Contents
     3      8.2 MiB      0.0 MiB   @profile
     4                             def my_func():
     5     15.8 MiB      7.6 MiB       a = [1] * (10 ** 6)
     6    168.4 MiB    152.6 MiB       b = [2] * (2 * 10 ** 7)
     7     15.8 MiB   -152.6 MiB       del b
     8     15.8 MiB      0.0 MiB       return a
My question is what does the 8.2 MiB in the first line of the table correspond to. My guess is that it is the initial memory usage by the python interpreter itself; but I am not sure. If that is the case, is there a way to have this baseline usage automatically subtracted from the memory usage of the script?
Many thanks for your time and consideration!
Noushin
According to the docs:
The first column represents the line number of the code that has been profiled, the second column (Mem usage) the memory usage of the Python interpreter after that line has been executed. The third column (Increment) represents the difference in memory of the current line with respect to the last one.
So, that 8.2 MiB is the memory usage after the first line has been executed. That includes the memory needed to start up Python, load your script and all of its imports (including memory_profiler itself), and so on.
There don't appear to be any documented options for removing that from each entry. But it wouldn't be too hard to post-process the results.
Alternatively, do you really need to do that? The third column shows how much additional memory has been used after each line, and either that, or the sum of that across a range of lines, seems more interesting than the difference between each line's second column and the start.
The difference in memory between lines is given in the third column (Increment), or you could write a small script to process the output.
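A post-processing script along those lines could look like the sketch below. It assumes the four-column report layout shown in the question, with values in MiB; other memory_profiler versions may print extra columns, in which case the pattern would need adjusting. subtract_baseline is a hypothetical helper name, and the first row that has a memory figure is taken as the baseline.

```python
import re

# Matches rows such as "     5     15.8 MiB      7.6 MiB       a = [1] * (10 ** 6)".
ROW = re.compile(r"^\s*(\d+)\s+([\d.]+) MiB\s+(-?[\d.]+) MiB\s*(.*)$")


def subtract_baseline(report):
    """Return (line number, MiB above the first reported usage, source) tuples."""
    rows = []
    baseline = None
    for line in report.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, separator, or a row without memory figures
        usage = float(match.group(2))
        if baseline is None:
            baseline = usage
        rows.append((int(match.group(1)), round(usage - baseline, 3), match.group(4)))
    return rows


sample = """\
Line #    Mem usage    Increment   Line Contents
     3      8.2 MiB      0.0 MiB   @profile
     4                             def my_func():
     5     15.8 MiB      7.6 MiB       a = [1] * (10 ** 6)
     6    168.4 MiB    152.6 MiB       b = [2] * (2 * 10 ** 7)
     7     15.8 MiB   -152.6 MiB       del b
     8     15.8 MiB      0.0 MiB       return a
"""

rows = subtract_baseline(sample)
for lineno, above_baseline, source in rows:
    print("%4d %8.1f MiB  %s" % (lineno, above_baseline, source))
```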
NB: This is my first foray into memory profiling with Python, so perhaps I'm asking the wrong question here. Advice re improving the question appreciated.
I'm working on some code where I need to store a few million small strings in a set. This, according to top, is using ~3x the amount of memory reported by heapy. I'm not clear what all this extra memory is used for and how I can go about figuring out whether I can - and if so how to - reduce the footprint.
memtest.py:
from guppy import hpy
import gc
hp = hpy()
# do setup here - open files & init the class that holds the data
print 'gc', gc.collect()
hp.setrelheap()
raw_input('relheap set - enter to continue') # top shows 14MB resident for python
# load data from files into the class
print 'gc', gc.collect()
h = hp.heap()
print h
raw_input('enter to quit') # top shows 743MB resident for python
The output is:
$ python memtest.py
gc 5
relheap set - enter to continue
gc 2
Partition of a set of 3197065 objects. Total size = 263570944 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 3197061 100 263570168 100 263570168 100 str
1 1 0 448 0 263570616 100 types.FrameType
2 1 0 280 0 263570896 100 dict (no owner)
3 1 0 24 0 263570920 100 float
4 1 0 24 0 263570944 100 int
So in summary, heapy shows 264MB while top shows 743MB. What's using the extra 500MB?
Update:
I'm running 64 bit python on Ubuntu 12.04 in VirtualBox in Windows 7.
I installed guppy as per the answer here:
sudo pip install https://guppy-pe.svn.sourceforge.net/svnroot/guppy-pe/trunk/guppy
Can somebody help me find out how much time and how much memory a piece of Python code takes?
Use this for calculating time:
import time
time_start = time.clock()
#run your code
time_elapsed = (time.clock() - time_start)
As referenced by the Python documentation:
time.clock()
On Unix, return the current processor time as a floating
point number expressed in seconds. The precision, and in fact the very
definition of the meaning of “processor time”, depends on that of the
C function of the same name, but in any case, this is the function to
use for benchmarking Python or timing algorithms.
On Windows, this function returns wall-clock seconds elapsed since the
first call to this function, as a floating point number, based on the
Win32 function QueryPerformanceCounter(). The resolution is typically
better than one microsecond.
Reference: http://docs.python.org/library/time.html
Use this for calculating memory:
import resource
resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
Reference: http://docs.python.org/library/resource.html
Based on @Daniel Li's answer, for cut&paste convenience and Python 3.x compatibility:
import sys
import time
import resource
time_start = time.perf_counter()
# insert code here ...
time_elapsed = time.perf_counter() - time_start
# ru_maxrss is reported in kilobytes on Linux but in bytes on macOS
rss_divisor = 1024.0 ** 2 if sys.platform == "darwin" else 1024.0
mem_mb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / rss_divisor
print("%5.1f secs %5.1f MByte" % (time_elapsed, mem_mb))
Example:
2.3 secs 140.8 MByte
There is a really good library called jackedCodeTimerPy for timing your code. You should then use the resource package that Daniel Li suggested.
jackedCodeTimerPy gives really good reports like
label min max mean total run count
------- ----------- ----------- ----------- ----------- -----------
imports 0.00283813 0.00283813 0.00283813 0.00283813 1
loop 5.96046e-06 1.50204e-05 6.71864e-06 0.000335932 50
I like how it gives you statistics on it and the number of times the timer is run.
It's simple to use. If I want to measure the time the code in a for loop takes, I just do the following:
from jackedCodeTimerPY import JackedTiming
JTimer = JackedTiming()
for i in range(50):
    JTimer.start('loop') # 'loop' is the name of the timer
    doSomethingHere = 'This is really useful!'
    JTimer.stop('loop')
print(JTimer.report()) # prints the timing report
You can also have multiple timers running at the same time.
JTimer.start('first timer')
JTimer.start('second timer')
do_something = 'amazing'
JTimer.stop('first timer')
do_something = 'else'
JTimer.stop('second timer')
print(JTimer.report()) # prints the timing report
There are more usage examples in the repo. Hope this helps.
https://github.com/BebeSparkelSparkel/jackedCodeTimerPY
Use a memory profiler like guppy
>>> from guppy import hpy; h=hpy()
>>> h.heap()
Partition of a set of 48477 objects. Total size = 3265516 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 25773 53 1612820 49 1612820 49 str
1 11699 24 483960 15 2096780 64 tuple
2 174 0 241584 7 2338364 72 dict of module
3 3478 7 222592 7 2560956 78 types.CodeType
4 3296 7 184576 6 2745532 84 function
5 401 1 175112 5 2920644 89 dict of class
6 108 0 81888 3 3002532 92 dict (no owner)
7 114 0 79632 2 3082164 94 dict of type
8 117 0 51336 2 3133500 96 type
9 667 1 24012 1 3157512 97 __builtin__.wrapper_descriptor
<76 more rows. Type e.g. '_.more' to view.>
>>> h.iso(1,[],{})
Partition of a set of 3 objects. Total size = 176 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 1 33 136 77 136 77 dict (no owner)
1 1 33 28 16 164 93 list
2 1 33 12 7 176 100 int
>>> x=[]
>>> h.iso(x).sp
0: h.Root.i0_modules['__main__'].__dict__['x']