I want to check the memory consumption of my Python code and have therefore added the following lines to it:
import resource
print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
As an alternative I have also tried this:
import os
import psutil

process = psutil.Process(os.getpid())
print(process.memory_info().rss)  # in bytes
However, I get different results: for example, 866 480 from resource and 730 689 536 from psutil. As you can see, the first value is in kilobytes and the second in bytes, but even after accounting for that, there is still a difference.
Reading the documentation, I still don't understand what causes the difference, so any input would be valuable.
TLDR: resource.getrusage sometimes misses that Python has already released objects from memory
There was a bug in memory_profiler (which was using resource.getrusage at the time). In this blog post, the different methods of memory measurement are described. I quote:
"this approach [resource.getrusage] is several times faster than the one based in psutil [...] The problem with this approach is that it seems to report results that are slightly different in some cases. Notably it seems to differ when objects have been recently liberated from the python interpreter. In the following example, orphaned arrays are liberated by the python interpreter, which is correctly seen by psutil but not by resource..."
Related
Recently I started having some problems with Django (3.1) tests, which I finally tracked down to some kind of memory leak.
I normally run my suite (roughly 4000 tests at the moment) with --parallel=4 which results in a high memory watermark of roughly 3GB (starting from 500MB or so).
For auditing purposes, though, I occasionally run it with --parallel=1 - when I do this, the memory usage keeps increasing, ending up over the VM's allocated 6GB.
I spent some time looking at the data and it became clear that the culprit is, somehow, WebTest - more specifically, its response.html and response.forms: each call during the test case might allocate a few MB (two or three, generally) which don't get released at the end of the test method and, more importantly, not even at the end of the TestCase.
I've tried everything I could think of - gc.collect() with gc.DEBUG_LEAK shows me a whole lot of collectable items, but it frees no memory at all; using delattr() on various TestCase and TestResponse attributes and so on resulted in no change at all.
I'm quite literally at my wits' end, so any pointer to solve this (beside editing the thousand or so tests which use WebTest responses, which is really not feasible) would be very much appreciated.
(Please note that I also tried using guppy, tracemalloc, and memory_profiler, but none of them gave me any kind of actionable information.)
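For reference, the gc experiment mentioned above was along these lines (a sketch, not the actual test code):
import gc

# DEBUG_LEAK makes the collector report collectable/uncollectable objects and,
# via DEBUG_SAVEALL, keep everything it finds in gc.garbage instead of freeing it.
gc.set_debug(gc.DEBUG_LEAK)

# ... run one test case here ...

found = gc.collect()
print("unreachable objects found:", found)
print("objects kept in gc.garbage:", len(gc.garbage))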
Update
I found that one of our EC2 testing instances isn't affected by the problem, so I spent some more time trying to figure this out.
Initially, I tried to find the "sensible" potential causes - for instance, the cached template loader, which was enabled on my local VM and disabled on the EC2 instance - without success.
Then I went all in: I replicated the EC2 virtualenv (with pip freeze) and the settings (copying the dotenv), and checked out the same commit where the tests were running normally on the EC2.
Et voilà! THE MEMORY LEAK IS STILL THERE!
Now, I'm officially giving up and will use --parallel=2 for future tests until some absolute guru can point me in the right direction.
Second update
And now the memory leak is there even with --parallel=2. I guess that's somehow better, since it looks increasingly like it's a system problem rather than an application problem. Doesn't solve it but at least I know it's not my fault.
Third update
Thanks to Tim Boddy's reply to this question, I tried using chap to figure out what's making memory grow. Unfortunately, I can't "read" the results properly, but it looks like some non-Python library is actually causing the problem.
So, this is what I've seen analyzing the core after a few minutes running the tests that I know cause the leak:
chap> summarize writable
49 ranges take 0x1e0aa000 bytes for use: unknown
1188 ranges take 0x12900000 bytes for use: python arena
1 ranges take 0x4d1c000 bytes for use: libc malloc main arena pages
7 ranges take 0x3021000 bytes for use: stack
139 ranges take 0x476000 bytes for use: used by module
1384 writable ranges use 0x38b5d000 (951,439,360) bytes.
chap> count used
3144197 allocations use 0x14191ac8 (337,189,576) bytes.
The interesting point is that the non-leaking EC2 instance shows pretty much the same value as the one I get from count used - which would suggest that those "unknown" ranges are the actual hogs.
This is also supported by the output of summarize used (showing the first few lines):
Unrecognized allocations have 886033 instances taking 0x8b9ea38(146,401,848) bytes.
Unrecognized allocations of size 0x130 have 148679 instances taking 0x2b1ac50(45,198,416) bytes.
Unrecognized allocations of size 0x40 have 312166 instances taking 0x130d980(19,978,624) bytes.
Unrecognized allocations of size 0xb0 have 73886 instances taking 0xc66ca0(13,003,936) bytes.
Unrecognized allocations of size 0x8a8 have 3584 instances taking 0x793000(7,942,144) bytes.
Unrecognized allocations of size 0x30 have 149149 instances taking 0x6d3d70(7,159,152) bytes.
Unrecognized allocations of size 0x248 have 10137 instances taking 0x5a5508(5,920,008) bytes.
Unrecognized allocations of size 0x500018 have 1 instances taking 0x500018(5,242,904) bytes.
Unrecognized allocations of size 0x50 have 44213 instances taking 0x35f890(3,537,040) bytes.
Unrecognized allocations of size 0x458 have 2969 instances taking 0x326098(3,301,528) bytes.
Unrecognized allocations of size 0x205968 have 1 instances taking 0x205968(2,120,040) bytes.
The size of those single-instance allocations is very similar to the kind of deltas I see if I add calls to resource.getrusage(resource.RUSAGE_SELF).ru_maxrss in my test runner when starting/stopping tests - but they're not recognized as Python allocations, hence my feeling that the leak lives outside the Python heap.
First of all, a huge apology: I was mistaken in thinking WebTest was the cause of this, and the reason was indeed in my own code, rather than libraries or anything else.
The real cause was a mixin class where I, unthinkingly, added a dict as a class attribute, like this:
class MyMixin:
    errors = dict()
Since this mixin is used in a few forms, and the tests generate a fair amount of form errors (that are added to the dict), this ended up hogging memory.
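For anyone who hits the same pattern, here is a minimal sketch of the pitfall and one possible fix (the class names are hypothetical):
class LeakyFormMixin:
    # Class attribute: a single dict shared by every instance and subclass,
    # so errors accumulate for the entire life of the process (or test run).
    errors = dict()


class FixedFormMixin:
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Instance attribute: a fresh dict per instance, freed with the instance.
        self.errors = {}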
While this is not very interesting in itself, there are a few takeaways that may be helpful to future explorers who stumble across the same kind of problem. They might all be obvious to everybody except me and a single other developer - in which case, hello other developer.
The reason why the same commit had different behaviors on the EC2 machine and my own VM is that the branch in the remote machine hadn't been merged yet, so the commit that introduced the leak wasn't there poisoning the environment.
The takeaway here is: make sure the code you're testing is the same, not just the commit.
Low-level memory analysis might help in some cases but it's not a skill you pick up in half a day: I spent a long time trying to make sense of allocations and objects and whatever without getting any closer to the solution.
This kind of mistake can be incredibly costly - if I had a few hundred fewer tests, I wouldn't have ended up with an OOM error, and I probably wouldn't have noticed the problem at all. Until it was in production, that is.
That could be fixed with some kind of linter/static analysis too, if there were one which flags this kind of construction as potentially harmful. Unfortunately, there isn't one (that I could find).
git bisect is your friend, as long as you can find a commit that actually works.
This question has already been asked a few times and I have already tried some of the suggested methods. Unfortunately, I somehow can't find out why my Python process uses so much memory.
My setup: python 3.5.2, Windows 10, and a lot of third-party packages.
The true memory usage of the process is 300 MB (way too much, and sometimes it even explodes to 32 GB):
import os
import psutil

process = psutil.Process(os.getpid())
memory_real = process.memory_info().rss / (1024 * 1024)  # --> 300 MB
What I tried so far:
memory line profiler (didn't help me)
tracemalloc.start(50), and then:
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
print("[ Top 10 ]")
for stat in top_stats[:10]:
    log_and_print(stat)
gives just a few MB as a result
gc.collect()
import objgraph
objgraph.show_most_common_types()
returns:
function 51791
dict 32939
tuple 28825
list 13823
set 10748
weakref 10551
cell 7870
getset_descriptor 6276
type 6088
OrderedDict 5083
(when the process was at 200 MB, the numbers above were even higher)
pympler: the process exits with some error code
So I'm really struggling to find out where the memory of the process is allocated. Am I doing something wrong, or is there some easy way to find out what is going on?
PS:
I was able to solve this problem by luck. It was a badly coded while loop, where a list was extended without a proper break condition.
Anyway, is there a way to find such memory leaks? What I often see is that memory profiling packages are called explicitly; in this case, I wouldn't have a chance to make a memory dump or check the memory in the main thread, since the loop is never left.
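One way to observe such a leak without instrumenting the loop itself is to sample memory from a daemon thread; a rough sketch (the interval and output format are arbitrary choices):
import os
import threading
import time
import tracemalloc

import psutil


def monitor(interval=10):
    """Periodically print the process RSS and the top tracemalloc lines."""
    process = psutil.Process(os.getpid())
    while True:
        rss_mb = process.memory_info().rss / (1024 * 1024)
        snapshot = tracemalloc.take_snapshot()
        print("RSS: %.1f MB" % rss_mb)
        for stat in snapshot.statistics("lineno")[:3]:
            print("   ", stat)
        time.sleep(interval)


tracemalloc.start(25)
threading.Thread(target=monitor, daemon=True).start()
# ... the (possibly never-returning) main loop runs here ...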
I'm doing some extensive scientific calculations in Python and want to know the execution time and memory footprint of my script.
So how to get peak memory usage of python script?
If it matters I'm on Windows and use python 2.7.
Sounds like you are looking for a memory profiler.
memory_profiler is one that lets you dive into which line is giving you problems, and with some querying you can figure out which area is the biggest memory consumer.
https://pypi.python.org/pypi/memory_profiler
and since you are using Windows, it will also need psutil: https://pypi.python.org/pypi/psutil
Good Luck!
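A minimal usage sketch (the function below is just a placeholder to show the line-by-line report):
# pip install memory_profiler psutil
from memory_profiler import profile


@profile
def build_data():
    a = [0] * (10 ** 6)        # ~8 MB
    b = [1] * (2 * 10 ** 7)    # ~160 MB
    del b                      # the report shows the decrement on this line
    return a


if __name__ == "__main__":
    build_data()
Running the script prints, for each line of build_data, the memory usage and the increment, which is usually enough to spot the offending area.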
The resource module can give you this. Works in both Python 2 and Python 3.
import resource
resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
This is the peak memory usage in kilobytes. The user and system CPU times are also included in the struct returned by getrusage.
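For example, the same result object also carries the CPU times (field semantics per the getrusage man page):
import resource

usage = resource.getrusage(resource.RUSAGE_SELF)
print(usage.ru_maxrss)  # peak resident set size (kilobytes on Linux)
print(usage.ru_utime)   # user CPU time, in seconds
print(usage.ru_stime)   # system CPU time, in seconds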
For the peak memory, as you are on Windows, you can use psutil and psutil.Process.memory_info, for example to get the peak working set size, in bytes:
>>> import psutil
>>> p = psutil.Process()
>>> p.memory_info().peak_wset
238530560L
As per the link above, you can get more details about some Windows-specific fields on this page.
I'm trying to identify a memory leak in a Python program I'm working on. I'm currently running Python 2.7.4 on 64-bit Mac OS. I installed heapy to hunt down the problem.
The program involves creating, storing, and reading large database using the shelve module. I am not using the writeback option, which I know can create memory problems.
Heapy shows that, during program execution, memory usage is roughly constant. Yet, my Activity Monitor shows rapidly increasing memory. Within 15 minutes, the process has consumed all my system memory (16 GB), and I start seeing page outs. Any idea why heapy isn't tracking this properly?
Take a look at this fine article. You are, most likely, not seeing memory leaks but memory fragmentation. The best workaround I have found is to identify what the output of your large working set operation actually is, load the large dataset in a new process, calculate the output, and then return that output to the original process.
This answer has some great insight and an example, as well. I don't see anything in your question that seems like it would preclude the use of PyPy.
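A rough sketch of that workaround using multiprocessing (the loader and the reduction step are hypothetical placeholders):
from multiprocessing import Pool


def heavy_computation(path):
    # Load the large dataset and boil it down to the small result you actually
    # need; both helpers here are hypothetical.
    data = load_big_dataset(path)
    return summarize(data)


if __name__ == "__main__":
    # The memory-hungry work runs in a worker process; when that worker exits,
    # the OS reclaims everything it allocated, fragmentation included.
    pool = Pool(processes=1)
    try:
        result = pool.apply(heavy_computation, ("data.db",))
    finally:
        pool.close()
        pool.join()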
When I load the file with json, Python's memory usage spikes to about 1.8 GB and I can't seem to get that memory released. I put together a test case that's very simple:
with open("test_file.json", 'r') as f:
j = json.load(f)
I'm sorry that I can't provide a sample json file; my test file has a lot of sensitive information, but for context, I'm dealing with a file on the order of 240 MB. After running the above code I have the previously mentioned 1.8 GB of memory in use. If I then do del j, memory usage doesn't drop at all. If I follow that with a gc.collect() it still doesn't drop. I even tried unloading the json module and running another gc.collect().
I'm trying to run some memory profiling but heapy has been churning 100% CPU for about an hour now and has yet to produce any output.
Does anyone have any ideas? I've also tried the above using cjson rather than the packaged json module. cjson used about 30% less memory but otherwise displayed exactly the same issues.
I'm running Python 2.7.2 on Ubuntu server 11.10.
I'm happy to load up any memory profiler and see if it does better than heapy, and provide any diagnostics you might think are necessary. I'm hunting around for a large test json file that I can provide for anyone else to give it a go.
I think these two links address some interesting points about this not necessarily being a json issue, but rather just a "large object" issue, and about how memory works in Python vs. the operating system.
See Why doesn't Python release the memory when I delete a large object? for why memory released by Python is not necessarily returned to the operating system:
If you create a large object and delete it again, Python has probably released the memory, but the memory allocators involved don’t necessarily return the memory to the operating system, so it may look as if the Python process uses a lot more virtual memory than it actually uses.
About running large object processes in a subprocess to let the OS deal with cleaning up:
The only really reliable way to ensure that a large but temporary use of memory DOES return all resources to the system when it's done, is to have that use happen in a subprocess, which does the memory-hungry work then terminates. Under such conditions, the operating system WILL do its job, and gladly recycle all the resources the subprocess may have gobbled up. Fortunately, the multiprocessing module makes this kind of operation (which used to be rather a pain) not too bad in modern versions of Python.
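Applied to the json example above, a sketch might look like this; the key point is to send back only the small result you need, not the full parsed object:
import json
from multiprocessing import Process, Queue


def count_records(path, out):
    # All of the parsing (and its memory overhead) lives and dies in this worker.
    with open(path, "r") as f:
        data = json.load(f)
    out.put(len(data))  # return only the small result you need


if __name__ == "__main__":
    out = Queue()
    worker = Process(target=count_records, args=("test_file.json", out))
    worker.start()
    n = out.get()
    worker.join()
    print(n)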