Memory leak in python-igraph with induced_subgraph function?

I'm working with igraph in Python and ran into a problem when calling the induced_subgraph function a million times: memory consumption keeps growing.
import gc
import igraph
import resource

g = igraph.Graph(n=4, edges=[[0, 1], [0, 2], [0, 3]])
mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
for i in range(1, 1000001):
    igraph_list_v = [0, 1, 2, 3]
    s = g.induced_subgraph(igraph_list_v)
    del s
    if i % 100000 == 0:
        print(gc.collect())
        print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss - mem)
        mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
Output:
0
4752
0
4752
0
4752
0
4752
0
4488
0
4752
0
4732
0
4752
0
4752
0
4752
The zeros come from printing gc.collect()'s return value. Based on the output, the induced_subgraph calls consume 4752 kilobytes per 100000 calls, and the reported memory increases with each print.
This is a dummy case, but I call induced_subgraph many millions of times and it breaks my script with a memory error. What is the actual reason for this behavior, and is there a workaround? Also keep in mind that I am at a beginner level, so please keep the answer as simple as possible. Thank you for your help!

Related

Python resource.getrusage says no new memory allocated

In the following, I try to measure the current process's memory usage with resource.getrusage. I allocate a large array (and can see the system memory increase with psutil), but getrusage reports the same rss value before and after allocating the array. What's up with that?
import psutil
import os
import resource
import numpy as np

def total_memory():
    v = psutil.virtual_memory()
    s = psutil.swap_memory()
    return v.used + s.used

result = resource.getrusage(resource.RUSAGE_SELF)
print "resource START ", result.ru_maxrss, result.ru_ixrss, result.ru_idrss, result.ru_isrss
init = total_memory()
print "psutil START ", init
data = np.random.rand(10000,10000)
result = resource.getrusage(resource.RUSAGE_SELF)
print "resource END ", result.ru_maxrss, result.ru_ixrss, result.ru_idrss, result.ru_isrss
print "psutil END+", total_memory()-init
reports:
resource START 1586580 0 0 0
psutil START 5074391040
resource END 1586580 0 0 0 # same
psutil END+ 793374720 # total memory increase by this much
Python 2.7.6. Ubuntu 14
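Note that ru_maxrss is the peak ("high-water mark") resident set size over the life of the process, not the current usage, so it only changes when a new peak is reached - and your process appears to have already peaked near 1.5 GB before the allocation (ru_maxrss is in kilobytes on Linux). A minimal sketch of my own illustrating the behavior:
import resource

def peak_kb():
    # ru_maxrss: lifetime peak RSS (kilobytes on Linux, bytes on macOS).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

print(peak_kb())
big = bytearray(500 * 1024 * 1024)   # ~500 MB, zero-filled, so pages are touched
print(peak_kb())                     # new, higher peak
del big
small = bytearray(10 * 1024 * 1024)  # stays below the old peak
print(peak_kb())                     # unchanged: still the old peak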

Python Memory Leak Using binascii, zlib, struct, and numpy

I have a python script which is processing a large amount of data from compressed ASCII. After a short period, it runs out of memory. I am not constructing large lists or dicts. The following code illustrates the issue:
import struct
import zlib
import binascii
import numpy as np
import psutil
import os
import gc
process = psutil.Process(os.getpid())
n = 1000000
compressed_data = binascii.b2a_base64(bytearray(zlib.compress(struct.pack('%dB' % n, *np.random.random(n))))).rstrip()
print 'Memory before entering the loop is %d MB' % (process.get_memory_info()[0] / float(2 ** 20))
for i in xrange(2):
    print 'Memory before iteration %d is %d MB' % (i, process.get_memory_info()[0] / float(2 ** 20))
    byte_array = zlib.decompress(binascii.a2b_base64(compressed_data))
    a = np.array(struct.unpack('%dB' % (len(byte_array)), byte_array))
    gc.collect()
gc.collect()
print 'Memory after last iteration is %d MB' % (process.get_memory_info()[0] / float(2 ** 20))
It prints:
Memory before entering the loop is 45 MB
Memory before iteration 0 is 45 MB
Memory before iteration 1 is 51 MB
Memory after last iteration is 51 MB
Between the first and second iteration, 6 MB of memory are allocated. If I run the loop more than two times, the memory usage stays at 51 MB. If I put the decompression code into its own function and feed it the actual compressed data, the memory usage continues to grow. I am using Python 2.7. Why is the memory increasing and how can it be corrected? Thank you.
Through comments, we figured out what was going on:
The main issue is that variables declared in a for loop are not destroyed once the loop ends. They remain accessible, pointing to the value they received in the last iteration:
>>> for i in range(5):
... a=i
...
>>> print a
4
So here's what's happening:
First iteration: the print shows 45 MB, which is the memory before instantiating byte_array and a.
The code then instantiates those two large variables, pushing the memory to 51 MB.
Second iteration: the two variables instantiated in the first run of the loop are still there.
In the middle of the second iteration, byte_array and a are overwritten by the new instantiations. The old ones are destroyed, but replaced by equally large objects.
The for loop ends, but byte_array and a are still accessible in the code and therefore not destroyed by the second gc.collect() call.
Changing the code to:
for i in xrange(2):
    [ . . . ]

byte_array = None
a = None
gc.collect()
made the memory reserved by byte_array and a unreachable, and therefore freed.
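In general, moving the work into a function achieves the same thing (a sketch of my own, not from the original answer), because locals become unreachable when the function returns - provided nothing else keeps a reference to the large objects:
def decompress_once(compressed_data):
    # byte_array and a are locals: they are freed as soon as the
    # function returns, instead of lingering after the loop ends.
    byte_array = zlib.decompress(binascii.a2b_base64(compressed_data))
    a = np.array(struct.unpack('%dB' % len(byte_array), byte_array))
    return len(a)

for i in range(2):
    decompress_once(compressed_data)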
There's more on Python's garbage collection in this SO answer: https://stackoverflow.com/a/4484312/289011
Also, it may be worth looking at How do I determine the size of an object in Python?. This is tricky, though... if your object is a list pointing to other objects, what is the size? The sum of the pointers in the list? The sum of the size of the objects those pointers point to?
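As a quick illustration of why it is tricky (my own example): sys.getsizeof reports only the shallow size of a container, not the objects it references.
import sys

data = ['x' * 1000 for _ in range(100)]
print(sys.getsizeof(data))                  # shallow size of the list object itself
print(sum(sys.getsizeof(s) for s in data))  # roughly 100 KB of referenced strings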

How to avoid high CPU usage with pysnmp

I am using pysnmp and have encountered high CPU usage. I know netsnmp is written in C and pysnmp in Python, so I would expect pysnmp's CPU times to be perhaps 20-100% higher because of that. Instead I am seeing CPU times about 20 times higher.
Am I using pysnmp correctly or could I do something to make it use less resources?
Test case 1 - PySNMP:
from pysnmp.entity.rfc3413.oneliner import cmdgen
import config
import yappi

yappi.start()
cmdGen = cmdgen.CommandGenerator()
errorIndication, errorStatus, errorIndex, varBindTable = cmdGen.nextCmd(
    cmdgen.CommunityData(config.COMMUNITY),
    cmdgen.UdpTransportTarget((config.HOST, config.PORT)),
    config.OID,
    lexicographicMode=False,
    ignoreNonIncreasingOid=True,
    lookupValue=False, lookupNames=False
)
for varBindTableRow in varBindTable:
    for name, val in varBindTableRow:
        print('%s' % (val,))
yappi.get_func_stats().print_all()
Test case 2 - NetSNMP:
import argparse
import netsnmp
import config
import yappi
yappi.start()
oid = netsnmp.VarList(netsnmp.Varbind('.'+config.OID))
res = netsnmp.snmpwalk(oid, Version = 2, DestHost=config.HOST, Community=config.COMMUNITY)
print(res)
yappi.get_func_stats().print_all()
If someone wants to test this for themselves, both test cases need a small settings file, config.py:
HOST = '192.168.1.111'
COMMUNITY = 'public'
PORT = 161
OID = '1.3.6.1.2.1.2.2.1.8'
I have compared the returned values and they are the same - so both examples function correctly. The difference is in timings:
PySNMP:
Clock type: cpu
Ordered by: totaltime, desc
name #n tsub ttot tavg
..dgen.py:408 CommandGenerator.nextCmd 1 0.000108 1.890072 1.890072
..:31 AsynsockDispatcher.runDispatcher 1 0.005068 1.718650 1.718650
..r/lib/python2.7/asyncore.py:125 poll 144 0.010087 1.707852 0.011860
/usr/lib/python2.7/asyncore.py:81 read 72 0.001191 1.665637 0.023134
..UdpSocketTransport.handle_read_event 72 0.001301 1.664446 0.023117
..py:75 UdpSocketTransport.handle_read 72 0.001888 1.663145 0.023099
..base.py:32 AsynsockDispatcher._cbFun 72 0.001766 1.658938 0.023041
..:55 SnmpEngine.__receiveMessageCbFun 72 0.002194 1.656747 0.023010
..4 MsgAndPduDispatcher.receiveMessage 72 0.008587 1.654553 0.022980
..eProcessingModel.prepareDataElements 72 0.014170 0.831581 0.011550
../ber/decoder.py:585 Decoder.__call__ 1224/216 0.111002 0.801783 0.000655
...py:312 SequenceDecoder.valueDecoder 288/144 0.034554 0.757069 0.002629
..tCommandGenerator.processResponsePdu 72 0.008425 0.730610 0.010147
..NextCommandGenerator._handleResponse 72 0.008692 0.712964 0.009902
...
NetSNMP:
Clock type: cpu
Ordered by: totaltime, desc
name #n tsub ttot tavg
..kages/netsnmp/client.py:227 snmpwalk 1 0.000076 0.103274 0.103274
..s/netsnmp/client.py:173 Session.walk 1 0.000024 0.077640 0.077640
..etsnmp/client.py:48 Varbind.__init__ 72 0.008860 0.035225 0.000489
..tsnmp/client.py:111 Session.__init__ 1 0.000055 0.025551 0.025551
...
So, netsnmp uses 0.103 s of CPU time and pysnmp uses 1.890 s of CPU time for the same operation. I find the results surprising... I have also tested the asynchronous mode, but the results were even a bit worse.
Am I doing something wrong (with pysnmp)?
UPDATE:
As per Ilya's suggestion, I have tried using BULK instead of WALK. BULK is indeed much faster overall, but pysnmp still uses about 20x the CPU time compared to netsnmp:
..dgen.py:496 CommandGenerator.bulkCmd 1 0.000105 0.726187 0.726187
Netsnmp:
..es/netsnmp/client.py:216 snmpgetbulk 1 0.000109 0.044421 0.044421
So the question still stands - can I make pySNMP less CPU intensive? Am I using it incorrectly?
Try using GETBULK instead of GETNEXT. With your code and a Max-Repetitions setting of 25, it gives a 5x performance improvement in my synthetic test.
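For reference, a minimal sketch of the GETBULK variant with the same oneliner API (my adaptation of the question's code; 0 is the non-repeaters count, 25 the max-repetitions):
from pysnmp.entity.rfc3413.oneliner import cmdgen
import config

cmdGen = cmdgen.CommandGenerator()
# GETBULK fetches up to 25 varbinds per request instead of one per GETNEXT.
errorIndication, errorStatus, errorIndex, varBindTable = cmdGen.bulkCmd(
    cmdgen.CommunityData(config.COMMUNITY),
    cmdgen.UdpTransportTarget((config.HOST, config.PORT)),
    0, 25,  # nonRepeaters, maxRepetitions
    config.OID,
    lexicographicMode=False,
    ignoreNonIncreasingOid=True
)
for varBindTableRow in varBindTable:
    for name, val in varBindTableRow:
        print('%s' % (val,))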

heapy reports memory usage << top

NB: This is my first foray into memory profiling with Python, so perhaps I'm asking the wrong question here. Advice re improving the question appreciated.
I'm working on some code where I need to store a few million small strings in a set. This, according to top, is using ~3x the amount of memory reported by heapy. I'm not clear what all this extra memory is used for and how I can go about figuring out whether I can - and if so how to - reduce the footprint.
memtest.py:
from guppy import hpy
import gc
hp = hpy()
# do setup here - open files & init the class that holds the data
print 'gc', gc.collect()
hp.setrelheap()
raw_input('relheap set - enter to continue') # top shows 14MB resident for python
# load data from files into the class
print 'gc', gc.collect()
h = hp.heap()
print h
raw_input('enter to quit') # top shows 743MB resident for python
The output is:
$ python memtest.py
gc 5
relheap set - enter to continue
gc 2
Partition of a set of 3197065 objects. Total size = 263570944 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 3197061 100 263570168 100 263570168 100 str
1 1 0 448 0 263570616 100 types.FrameType
2 1 0 280 0 263570896 100 dict (no owner)
3 1 0 24 0 263570920 100 float
4 1 0 24 0 263570944 100 int
So in summary, heapy shows 264MB while top shows 743MB. What's using the extra 500MB?
Update:
I'm running 64 bit python on Ubuntu 12.04 in VirtualBox in Windows 7.
I installed guppy as per the answer here:
sudo pip install https://guppy-pe.svn.sourceforge.net/svnroot/guppy-pe/trunk/guppy
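One thing worth checking (my own suggestion, not from the question): hp.setrelheap() excludes everything that already existed at that point, so a container created during setup - such as the object that later holds the strings - may not show up in the relative heap even after it grows. Between that, CPython's per-object overhead, and the allocator rarely returning freed memory to the OS, much of the gap can hide outside heapy's view. A rough sanity check with sys.getsizeof:
import sys

s = set()
for i in range(3200000):
    s.add('string-%d' % i)
# Shallow size of the set's hash table alone; a relative heap taken
# after the set was created may not include this.
print('set table: %d MB' % (sys.getsizeof(s) / 2 ** 20))
print('strings:   %d MB' % (sum(sys.getsizeof(x) for x in s) / 2 ** 20))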

Calculating computational time and memory for a code in python

Can somebody help me find out how much time and how much memory a piece of Python code takes?
Use this for calculating time:
import time
time_start = time.clock()
#run your code
time_elapsed = (time.clock() - time_start)
As referenced by the Python documentation:
time.clock()
On Unix, return the current processor time as a floating
point number expressed in seconds. The precision, and in fact the very
definition of the meaning of “processor time”, depends on that of the
C function of the same name, but in any case, this is the function to
use for benchmarking Python or timing algorithms.
On Windows, this function returns wall-clock seconds elapsed since the
first call to this function, as a floating point number, based on the
Win32 function QueryPerformanceCounter(). The resolution is typically
better than one microsecond.
Reference: http://docs.python.org/library/time.html
Use this for calculating memory:
import resource
resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
Reference: http://docs.python.org/library/resource.html
Based on @Daniel Li's answer, for cut-and-paste convenience and Python 3.x compatibility:
import time
import resource
time_start = time.perf_counter()
# insert code here ...
time_elapsed = (time.perf_counter() - time_start)
memMb=resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024.0/1024.0
print ("%5.1f secs %5.1f MByte" % (time_elapsed,memMb))
Example:
2.3 secs 140.8 MByte
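One caveat (my addition): ru_maxrss uses different units per platform - kilobytes on Linux, bytes on macOS - so the double division above only yields megabytes on macOS. A sketch of a portable variant:
import resource
import sys

peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
# Normalise units: bytes on macOS ('darwin'), kilobytes on Linux.
if sys.platform == 'darwin':
    peak_mb = peak / 1024.0 / 1024.0
else:
    peak_mb = peak / 1024.0
print("%5.1f MByte peak" % peak_mb)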
There is a really good library called jackedCodeTimerPy for timing your code. You should then use the resource package that Daniel Li suggested.
jackedCodeTimerPy gives really good reports like
label min max mean total run count
------- ----------- ----------- ----------- ----------- -----------
imports 0.00283813 0.00283813 0.00283813 0.00283813 1
loop 5.96046e-06 1.50204e-05 6.71864e-06 0.000335932 50
I like how it gives you statistics on the timings and the number of times each timer ran.
It's simple to use. If I want to measure the time a for loop takes, I just do the following:
from jackedCodeTimerPY import JackedTiming
JTimer = JackedTiming()
for i in range(50):
    JTimer.start('loop')  # 'loop' is the name of the timer
    doSomethingHere = 'This is really useful!'
    JTimer.stop('loop')
print(JTimer.report())  # prints the timing report
You can also have multiple timers running at the same time.
JTimer.start('first timer')
JTimer.start('second timer')
do_something = 'amazing'
JTimer.stop('first timer')
do_something = 'else'
JTimer.stop('second timer')
print(JTimer.report()) # prints the timing report
There are more usage examples in the repo. Hope this helps.
https://github.com/BebeSparkelSparkel/jackedCodeTimerPY
Use a memory profiler like guppy
>>> from guppy import hpy; h=hpy()
>>> h.heap()
Partition of a set of 48477 objects. Total size = 3265516 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 25773 53 1612820 49 1612820 49 str
1 11699 24 483960 15 2096780 64 tuple
2 174 0 241584 7 2338364 72 dict of module
3 3478 7 222592 7 2560956 78 types.CodeType
4 3296 7 184576 6 2745532 84 function
5 401 1 175112 5 2920644 89 dict of class
6 108 0 81888 3 3002532 92 dict (no owner)
7 114 0 79632 2 3082164 94 dict of type
8 117 0 51336 2 3133500 96 type
9 667 1 24012 1 3157512 97 __builtin__.wrapper_descriptor
<76 more rows. Type e.g. '_.more' to view.>
>>> h.iso(1,[],{})
Partition of a set of 3 objects. Total size = 176 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 1 33 136 77 136 77 dict (no owner)
1 1 33 28 16 164 93 list
2 1 33 12 7 176 100 int
>>> x=[]
>>> h.iso(x).sp
0: h.Root.i0_modules['__main__'].__dict__['x']
