I just ran a Python program in the macOS Terminal, and there is an unusual memory leak.
The program is as simple as this:
for i in xrange(1000000000, 2000000000, 10):
    i2 = i * i
    print i, i2, str(i2)[::2]
    if str(i2)[::2] == '1234567890':
        break
While the program is running, it consumes more and more memory until it uses up all of my memory.
When I terminate the program, my Terminal.app still consumes a lot of memory, so I guess it's a bug in Terminal.app?
Does anyone have similar experience?
This isn't a bug; it's actually a feature. Terminal.app, like many other terminal emulators, saves recent output in a buffer so that you can scroll back (with page up or the scroll bar). You can limit how large this is by going to Terminal -> Preferences -> Settings and setting the scrollback limit to something other than Unlimited.
It's not Python that is leaking memory. Look closer. On my machine, the Python process remains at a quiet, stable 3.5 MB of memory.
The memory usage increase you see is most likely due to the Terminal never discarding output. You can change this behavior by going to Preferences -> Settings and setting the maximum number of lines to something other than "Unlimited".
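If you'd rather not change the Terminal setting, another way to keep the scrollback buffer small is simply to print less. Here is a hedged variation of the loop from the question (not the original poster's code) that only prints when a match is found:

# Same search as above, but without printing every iteration,
# so the terminal scrollback buffer stays small.
for i in xrange(1000000000, 2000000000, 10):
    i2 = i * i
    if str(i2)[::2] == '1234567890':
        print i, i2, str(i2)[::2]
        break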
I have a massive Python script I inherited. It runs continuously on a long list of files, opens them, does some processing, creates plots, writes some variables to a new text file, then loops back over the same files (or waits for new files to be added to the list).
My memory usage steadily climbs until my RAM is full within an hour or so. The code is designed to run 24/7/365 and apparently used to work just fine. I can watch the RAM usage rising in Task Manager, and when I interrupt the code the RAM stays used until I restart the Python kernel.
I have used sys.getsizeof() to check all my variables and none are unusually large/increasing with time. This is odd - where is the RAM going then? The text files I am writing to? I have checked and as far as I can tell every file creation ends with a f.close() statement, closing the file. Similar for my plots that I create (I think).
What else would be steadily eating away at my RAM? Any tips or solutions?
What I'd like to do is some sort of "close all open files/figures" command at some point in my code. I am aware of the del command, but then I'd have to list hundreds of variables at multiple points in my code to routinely delete them (and, as I pointed out, I already checked getsizeof and none of the variables are large; the largest was 9433 bytes).
Thanks for your help!
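For what it's worth, here is a hedged sketch of the kind of "close everything" cleanup step the question above asks about, assuming matplotlib is the plotting library; process_file and file_list are hypothetical stand-ins for the script's real work. One common source of this exact pattern is that pyplot keeps a reference to every figure it creates until the figure is explicitly closed, so the figures never get garbage-collected.

import gc
import matplotlib.pyplot as plt

def process_file(filename):
    # Hypothetical stand-in for the real per-file work: open the file,
    # make a plot, save it. The 'with' block closes the file even on errors.
    fig, ax = plt.subplots()
    with open(filename) as f:
        ax.plot([len(line) for line in f])
    fig.savefig(filename + '.png')

file_list = ['a.txt', 'b.txt']   # hypothetical list of input files
for filename in file_list:
    process_file(filename)
    plt.close('all')             # release the memory held by open figures
    gc.collect()                 # reclaim anything that is no longer referenced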
I have written a program that expands a database of prime numbers. The program is written in Python and runs on Windows 10 (x64) with 8 GB of RAM.
The program stores all primes it has found in a list of integers for further calculations and uses approximately 6-7 GB of RAM while running. During some runs, however, this figure has dropped to below 100 MB. The memory usage then stays low for the rest of the run, though it increases as expected as more numbers are added to the prime array. Note that not all runs result in a memory drop.
(Memory usage measured with Task Manager.)
These seemingly random drops have led me to the following theories:
1. There's a bug in my code that drops critical data and messes up the results (most likely, but not supported by the results).
2. Python just happens to optimize my code extremely well after a while.
3. Python or Windows is compensating for my over-use of RAM by cleaning out portions of the prime-number array that aren't used much (eventually resulting in incorrect calculations).
4. Python or Windows is compensating for my over-use of RAM by allocating disk space instead of RAM.
Questions
What could be the reason(s) for this memory drop?
How does Python handle programs that use more than the available RAM?
How does Windows handle programs that use more than the available RAM?
Theories 1, 2, and 3 are incorrect.
Theory 4 is correct: Windows (not Python) is moving some of your process memory to swap space. This is almost totally transparent to your application; you don't need to do anything special to respond to or handle it. The only thing you will notice is that your application may get slower as information is written to and read from disk, but it all happens transparently. See https://en.wikipedia.org/wiki/Virtual_memory for more information.
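One way to see this from inside the program is to compare the resident (working set) size with the total virtual size of the process. This is a hedged sketch using the third-party psutil package (the same library used in an answer further down), not something from the original program:

import psutil

# rss (the "working set" Task Manager shows) is the part of the process held
# in physical RAM; vms is the total address space, including pages that have
# been moved to the page file. A large gap between the two is paging at work,
# not lost data.
info = psutil.Process().memory_info()
print('resident: %d MB' % (info.rss // 2**20))
print('virtual:  %d MB' % (info.vms // 2**20))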
Have you heard of paging? Windows dumps some RAM (that hasn't been used in a while) to your hard drive to keep your computer from running out of RAM and ultimately crashing.
Only Windows deals with memory management here. If you use Windows 10, it will also compress memory, somewhat like a zip file.
I'm working on a Python program which sometimes fills up a list with millions of items. The computer (Ubuntu) starts swapping and the debugger (Eclipse) becomes unresponsive.
Is it possible to add a line in the cycle that checks how much memory is being used, and interrupts the execution, so I can check what's going on?
I'm thinking about something like:
if usedmemory() > 1000000000:
    pass  # with a breakpoint here
but I don't know what usedmemory() could be.
This is highly dependent on the machine you're running Python on. Here's an SO answer with a way to do it on Linux (https://stackoverflow.com/a/278271/541208), but another answer there offers a more platform-independent solution (https://stackoverflow.com/a/2468983/541208): the psutil library, which you can install via pip install psutil:
>>> psutil.virtual_memory()
vmem(total=8374149120L, available=2081050624L, percent=75.1, used=8074080256L, free=300068864L, active=3294920704, inactive=1361616896, buffers=529895424L, cached=1251086336)
>>> psutil.swap_memory()
swap(total=2097147904L, used=296128512L, free=1801019392L, percent=14.1, sin=304193536, sout=677842944)
So you'd look at the percentage of memory in use and pause or kill your process depending on how much memory it has been using.
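Putting that together, a minimal sketch of the usedmemory() check from the question could look like this (psutil-based; the loop and the 1 GB threshold are just stand-ins for the question's real code):

import psutil

def usedmemory():
    # Resident set size (physical RAM) used by the current process, in bytes.
    return psutil.Process().memory_info().rss

big_list = []
for i in range(100000000):                 # stand-in for the real item-producing loop
    big_list.append(i)
    if i % 100000 == 0 and usedmemory() > 1000000000:
        pass                               # put the breakpoint on this line

Checking memory on every iteration is slow, which is why the sketch only samples it every 100,000 items.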
I'm trying to identify a memory leak in a Python program I'm working on. I'm currently running Python 2.7.4 on 64-bit Mac OS X. I installed heapy to hunt down the problem.
The program involves creating, storing, and reading a large database using the shelve module. I am not using the writeback option, which I know can create memory problems.
Heapy shows that memory usage is roughly constant during program execution. Yet Activity Monitor shows rapidly increasing memory use. Within 15 minutes, the process has consumed all my system memory (16 GB), and I start seeing page-outs. Any idea why heapy isn't tracking this properly?
Take a look at this fine article. You are, most likely, not seeing memory leaks but memory fragmentation. The best workaround I have found is to identify what the output of your large working set operation actually is, load the large dataset in a new process, calculate the output, and then return that output to the original process.
This answer has some great insight and an example, as well. I don't see anything in your question that seems like it would preclude the use of PyPy.
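As a hedged sketch of the "do the heavy work in a separate process" idea, using the standard multiprocessing module: build_output and 'mydata.db' are hypothetical stand-ins for the real working-set operation and shelve file. Because the fragmentation lives in the worker's address space, it disappears when the worker exits instead of accumulating in the long-lived parent.

import multiprocessing
import shelve

def build_output(path):
    # Hypothetical heavy step: open the shelve database, do the large
    # working-set computation, and return only the (small) result.
    db = shelve.open(path)
    try:
        return len(db)
    finally:
        db.close()

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=1)
    result = pool.apply(build_output, ('mydata.db',))  # blocks until the worker finishes
    pool.close()
    pool.join()
    print(result)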
I have a relatively simple (no classes) Python 2.7 program. The first thing the program does is read an SQLite database into a dictionary. The database is large, but not huge, around 90 MB on disk. It takes about 20 seconds to read in. After reading in the database I initialize some variables, e.g.:
localMax = 0
localMin = 0
firstTime = True
When I debug this program in Eclipse 3.7.0/PyDev, even on these simple lines, each single-step in the debugger eats up 100% of a core and takes between 5 and 10 seconds. I can see the Python process go to 100% CPU for 10 seconds. Single-step... wait 10 seconds... single-step... wait 10 seconds... If I debug at the command line just using pdb, there are no problems. If I'm not debugging at all, the program runs at "normal" speed, nothing strange like in Eclipse.
I've reproduced this on a dual-core Windows 7 PC with 4 GB of memory, my 8-core Ubuntu box with 8 GB of memory, and even my MacBook Air. How's that for multi-platform development! I kept thinking it would work somewhere. I'm never even close to running out of memory at any time.
On each Eclipse single-step, why does the Python process jump to 100% CPU and take 10 seconds?
Here is a good enough workaround, based on Mikko Ohtamaa's hint. I just verified the following on my Mac Air:
If I simply close the 'Variables' window in the Eclipse GUI, I can single-step through the code at normal speed. Which is great, but, uh, then I don't have the Variables window.
For any variable I want to see, I can hover my cursor over the variable and see its value. I didn't attempt to hover over the large dictionary that is the culprit here.
I can also right-click on any variable and add a 'Watch', which brings up an 'Expressions' window. In this case the variable is just a degenerate (very simple) case of an 'expression'.
So, the workaround for me is to close the Eclipse Variables window and use the Expressions window to selectively view variables. A pain, but for the debugging I'm doing it is better than pdb.
I simply commented this line out:
np.set_printoptions(threshold = 'nan')
It seems Eclipse is trying to keep track of too much information.
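If the unlimited print threshold really was the trigger, keeping a finite threshold might work just as well as deleting the line; this is a hypothetical alternative, not something the answer tested:

import numpy as np

# Summarize arrays with more than 1000 elements instead of printing them in
# full, so the debugger never has to build a huge repr for the Variables view.
np.set_printoptions(threshold=1000)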