What is the "right" or "best" way to monitor the system resources a Python script is using, and to terminate it if the resource use exceeds some predetermined values? In my case memory usage is of concern. I am not asking how to measure the system resource use, although I am open to suggestions.
As a simple example, let's assume I have a function that finds prime numbers less than some large number and adds them to a list based on some condition. I don't know ahead of time how many prime numbers will satisfy the condition, so I want to be sure to terminate the function if I use up too much system memory (8 GB, let's say).
I know that there are ways to monitor the size of Python objects. What I don't know is the proper way to monitor the size of the list and exit: should I just include a size test in the prime-function loop and exit if it exceeds 8 GB, or is there an "external" (by external I mean external to the loop, but still within or part of the Python script) way to monitor and exit?
In my case I am running on a Mac, but I am asking the question in general.
On Unix-like systems, a useful "external" way to monitor any process is the ulimit command (you don't clarify whether you instead want to run on Windows, where ulimit doesn't exist and other approaches may be needed, but I don't know them ;-).
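If it helps, here is a rough sketch of applying the same kind of cap from inside the script with resource.setrlimit, which is the mechanism the shell's ulimit wraps (the 8 GB figure is just the limit from the question, and on macOS the address-space limit is not always enforced, so treat this as approximate):

import resource

# Cap the process's address space at roughly 8 GB; once the limit is hit,
# further allocations raise MemoryError instead of exhausting the machine.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (8 * 1024 ** 3, hard))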
If you're thinking about performing such controls inside your own Python programs, just change the function in question to check the size of each object it's appending to the list (and keep a running total) and return when the running total reaches or exceeds a threshold (which you could pass as an extra parameter to the function in question).
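A minimal sketch of that in-function approach (the names are made up for illustration, and sys.getsizeof only counts the objects themselves, so the threshold is approximate):

import sys

def collect_primes(limit, max_bytes=8 * 1024 ** 3):
    primes = []
    total = sys.getsizeof(primes)
    for n in range(2, limit):
        if all(n % p for p in primes if p * p <= n):  # crude trial division
            primes.append(n)
            total += sys.getsizeof(n)
            if total >= max_bytes:
                break  # stop before the list grows past the budget
    return primes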
Edit: the OP has clarified in a comment that they want the monitoring in the very worst place it could possibly be placed -- in the previous paragraphs, I mentioned how it's easy outside of the process, easy inside the function, but the OP wants it "smack in the middle";-).
Least-bad way is probably with a "watchdog thread" -- a separate daemon thread in an infinite loop which, every X seconds, checks the process's resource consumption (e.g. with resource.getrusage, if on Unix-like machines -- again, if on Windows, something else is needed instead) and, if that consumption exceeds the desired limits, attempts to kill the main thread with thread.interrupt_main. Of course, this is far from foolproof: the period X (as in all cases of "polling") must be short enough to catch a runaway process in time, but long enough not to slow the process down to a crawl. Plus, the main thread (the only one that can be interrupted like this) might be blocking exceptions (in which case the watchdog thread might perhaps try with "signals to this very process" of growing severity, all the way up to SIGKILL, the killer signal that can never be blocked or intercepted).
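A rough sketch of such a watchdog (the 8 GB limit and one-second period are placeholders; note that ru_maxrss is reported in kilobytes on Linux but in bytes on macOS):

import resource
import threading
import time
import _thread  # on Python 2 this module is called "thread"

MAX_RSS = 8 * 1024 ** 3  # bytes

def watchdog(period=1.0):
    while True:
        time.sleep(period)
        usage = resource.getrusage(resource.RUSAGE_SELF)
        rss = usage.ru_maxrss * 1024  # kilobytes -> bytes (Linux convention)
        if rss > MAX_RSS:
            _thread.interrupt_main()  # raises KeyboardInterrupt in the main thread
            return

threading.Thread(target=watchdog, daemon=True).start()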
So, this intermediate approach is a lot more work than the ulimit command, is more fragile, and has no substantial added value. But, if you want to put the monitoring "inside the process but outside the resource-consuming function", with no advantages, lots of work, and the other disadvantages I've mentioned, this is the way to do it.
resource.getrusage() can give you the resource usage of the current Python interpreter, which you can use as a sentinel to stop processing. (Note that some fields, such as ru_idrss, are left at zero on Linux and macOS; ru_maxrss, the peak resident set size, is usually the one to check.)
I have written a python program that needs to run for multiple days at a time, because of the constant collection of data. Previously I had no issues running this program for months at a time. I recently made some updates to the program, and now after around 12 hours I get the dreaded out of memory killer. The 'dmesg' output is the following:
[9084334.914808] Out of memory: Kill process 2276 (python2.7) score 698 or sacrifice child
[9084334.914811] Killed process 2276 (python2.7) total-vm:13279000kB, anon-rss:4838164kB, file-rss:8kB
Besides just general Python coding, the main change made to the program was the addition of a multiprocessing Queue. This is the first time I have ever used this feature, so I am not sure if this might be the cause of the issue. The purpose of the Queue in my program is to be able to make dynamic changes in a parallel process. The Queue is initialized in the main program and is continually being monitored in the parallel process. A simplified version of how I am doing this in the parallel process is the following (with 'q' being the Queue):
while(1):
    if q.empty():
        None
    else:
        fr = q.get()
        # Additional code
    time.sleep(1)
The dynamic changes to 'q' do not happen very often, so the majority of the time q.empty() will be true, but the loop is there to be ready as soon as changes are made. My question is, would running this code for multiple hours at a time cause the memory to eventually run low? With the 'while' loop being pretty short and running basically non-stop, I was thinking this might be a problem. If this could be the cause of the problem, does anybody have any suggestions on how to improve the code so the out-of-memory killer doesn't get called?
Thank you very much.
The only way you can run out of memory in the way you describe is if you're using more and more memory as time goes on. The loop here does not demonstrate this behavior, so it cannot be (solely) responsible for any memory errors. Running a tight, infinite loop can burn through a lot of needless processor cycles, but it can't cause a MemoryError by itself unless it's also storing data somewhere else.
It's likely that elsewhere in your code, you're holding onto some variables that you don't intend to. This is called a memory leak, and you can use a memory profiler to look for where such a leak is coming from.
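If you are on Python 3, the standard-library tracemalloc module is one such profiler (the dmesg output above shows python2.7, so for that interpreter you would need a third-party profiler instead, but a minimal sketch of the idea is):

import tracemalloc

tracemalloc.start()

# ... let the suspect part of the program run for a while ...

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)  # the ten allocation sites holding the most memory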
Some likely suspects are caching methods used to improve performance, or lists of variables that never leave scope. Perhaps your multiprocessing queue is holding on to references to earlier data objects, or items are never deleted from the queue once they're inserted? (This latter case is unlikely given the code you've shown if you're using the builtin queue.Queue, but anything is possible).
You can run your program as a Linux service and set its OOM policy to continue.
You can check these links to see how to view/edit service parameters and to see the OOM policy service parameter, respectively.
My wx GUI shows thumbnails, but they're slow to generate, so:
The program should remain usable while the thumbnails are generating.
Switching to a new folder should stop generating thumbnails for the old folder.
If possible, thumbnail generation should make use of multiple processors.
What is the best way to do this?
Putting the thumbnail generation in a background thread with threading.Thread will solve your first problem, making the program usable.
If you want a way to interrupt it, the usual way is to add a "stop" variable which the background thread checks every so often (e.g., once per thumbnail), and the GUI thread sets when it wants to stop it. Ideally you should protect this with a threading.Condition. (The condition isn't actually necessary in most cases—the same GIL that prevents your code from parallelizing well also protects you from certain kinds of race conditions. But you shouldn't rely on that.)
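A small sketch of that pattern, using a threading.Event as the stop flag rather than a bare variable plus Condition (make_thumbnail and the folder contents are placeholders):

import threading
import time

def make_thumbnail(path):
    time.sleep(0.1)  # stand-in for the real (slow) image work

paths = ["a.jpg", "b.jpg", "c.jpg"]  # placeholder folder contents
stop_flag = threading.Event()

def generate_thumbnails(paths):
    for path in paths:
        if stop_flag.is_set():  # checked once per thumbnail
            return
        make_thumbnail(path)

worker = threading.Thread(target=generate_thumbnails, args=(paths,), daemon=True)
worker.start()

# later, e.g. when the user switches folders:
stop_flag.set()
worker.join()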
For the third problem, the first question is: Is thumbnail generation actually CPU-bound? If you're spending more time reading and writing images from disk, it probably isn't, so there's no point trying to parallelize it. But, let's assume that it is.
First, if you have N cores, you want a pool of N threads, or N-1 if the main thread has a lot of work to do too, or maybe something like 2N or 2N-1 to trade off a bit of best-case performance for a bit of worst-case performance.
However, if that CPU work is done in Python, or in a C extension that nevertheless holds the Python GIL, this won't help, because most of the time, only one of those threads will actually be running.
One solution to this is to switch from threads to processes, ideally using the standard multiprocessing module. It has built-in APIs to create a pool of processes, and to submit jobs to the pool with simple load-balancing.
The problem with using processes is that you no longer get automatic sharing of data, so that "stop flag" won't work. You need to explicitly create a flag in shared memory, or use a pipe or some other mechanism for communication instead. The multiprocessing docs explain the various ways to do this.
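A sketch of the process-pool variant with an explicitly shared stop flag (a Manager().Event() is used because it can be passed to pool workers; make_thumbnail is again a stand-in for the real work):

import multiprocessing
import time

def _init_worker(event):
    global _stop_event          # each worker process keeps its own reference
    _stop_event = event

def make_thumbnail(path):
    if _stop_event.is_set():    # bail out early if cancellation was requested
        return None
    time.sleep(0.1)             # stand-in for the real CPU-bound image work
    return path

if __name__ == "__main__":
    manager = multiprocessing.Manager()
    stop_event = manager.Event()
    pool = multiprocessing.Pool(initializer=_init_worker, initargs=(stop_event,))

    paths = ["a.jpg", "b.jpg", "c.jpg"]             # placeholder folder contents
    result = pool.map_async(make_thumbnail, paths, chunksize=1)

    # when the user switches folders:
    stop_event.set()            # running jobs finish; queued ones return quickly
    print(result.get())
    pool.close()
    pool.join()

The chunksize argument here is the same knob mentioned below for controlling how jobs are batched when load-balancing.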
You can actually just kill the subprocesses. However, you may not want to do this. First, unless you've written your code carefully, it may leave your thumbnail cache in an inconsistent state that will confuse the rest of your code. Also, if you want this to be efficient on Windows, creating the subprocesses takes some time (not as in "30 minutes" or anything, but enough to affect the perceived responsiveness of your code if you recreate the pool every time a user clicks a new folder), so you probably want to create the pool before you need it, and keep it for the entire life of the program.
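If you do decide to kill the workers, Pool exposes that directly (assuming a multiprocessing.Pool instance like the one in the sketch above):

pool.terminate()   # forcibly stop all worker processes
pool.join()        # wait until they are gone before continuing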
Other than that, all you have to get right is the job size. Hopefully creating one thumbnail isn't too big of a job—but if it's too small of a job, you can batch multiple thumbnails up into a single job—or, more simply, look at the multiprocessing API and change the way it batches jobs when load-balancing (e.g. the chunksize argument to Pool.map).
Meanwhile, if you go with a pool solution (whether threads or processes), if your jobs are small enough, you may not really need to cancel. Just drain the job queue—each worker will finish whichever job it's working on now, but then sleep until you feed in more jobs. Remember to also drain the queue (and then maybe join the pool) when it's time to quit.
One last thing to keep in mind is that if you successfully generate thumbnails as fast as your computer is capable of generating them, you may actually cause the whole computer—and therefore your GUI—to become sluggish and unresponsive. This usually comes up when your code is actually I/O bound and you're using most of the disk bandwidth, or when you use lots of memory and trigger swap thrash, but if your code really is CPU-bound, and you're having problems because you're using all the CPU, you may want to either use 1 fewer core, or look into setting thread/process priorities.
What is the best way to continuously repeat the execution of a given function at a fixed interval while being able to terminate the executor (thread or process) immediately?
Basically I know two approaches:
use multiprocessing and a function with an infinite loop and time.sleep at the end. Processing is terminated with process.terminate() in any state.
use threading and constantly recreate timers at the end of the thread function. Processing is terminated by timer.cancel() while sleeping.
(Both "in any state" and "while sleeping" are fine, even though the latter may not be immediate.) The problem is that I have to use both multiprocessing and threading: the latter appears not to work on ARM (some fuzzy interaction of the Python interpreter and vim; outside of vim everything is fine) (I was using the second approach there, have not tried threading plus a loop, and no code is currently left), and the former spawns way too many processes which I would like not to see unless really required. This leads to having to code two different approaches, while threading with a loop would be just a few more imports for drop-in replacements of all the multiprocessing stuff wrapped in if/else (except that there is no thread.terminate()). Is there some better way to do the job?
Currently used code is here (currently with cycle for both jobs), but I do not think it will be much useful to answer the question.
Update: The reason why I am using this solution is that I have functions that display the file status (and some other things, like the branch) of version control systems in the vim statusline. These statuses must be updated, but updating them immediately cannot be done without using hooks, and I have no idea how to set hooks temporarily and remove them on vim quit without possibly spoiling the user's configuration. Thus the standard solution is a cache expiring after N seconds. But when the cache expires I need to do an expensive shell call, and the delay is noticeable; the heavier the I/O load, the more noticeable it is. What I am implementing now is updating the values for viewed buffers every N seconds in a separate process, so the delays bother that process and not me. Threads are likely to also work, because the GIL does not affect calls to external programs.
I'm not clear on why a single long-lived thread that loops infinitely over the tasks wouldn't work for you? Or why you end up with many processes in the multiprocess option?
My immediate reaction would have been a single thread with a queue to feed it things to do. But I may be misunderstanding the problem.
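Combining the two ideas, a single long-lived thread can sleep on an Event instead of time.sleep, so it both repeats at a fixed interval and can be stopped immediately, even mid-sleep (a sketch; update_status stands in for the expensive status refresh):

import threading

def update_status():
    pass  # placeholder for the expensive shell call / cache refresh

stop = threading.Event()

def repeat(func, interval):
    # Event.wait returns True as soon as stop.set() is called,
    # so the loop exits immediately even in the middle of the "sleep".
    while not stop.wait(interval):
        func()

worker = threading.Thread(target=repeat, args=(update_status, 5.0), daemon=True)
worker.start()

# to terminate:
stop.set()
worker.join()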
I do not know how to do it simply and/or cleanly in Python, but I was wondering if you couldn't take advantage of an existing system scheduler, e.g. crontab on *nix systems.
There is a Python API for it, and it might satisfy your needs.
I am currently working in Python and my program looks like this:
function(1)
function(2)
...
function(100)
Performing a function takes ~30 minutes at 100% CPU, so executing the program takes a lot of time. The functions access the same file for inputs, do a lot of math and print the results.
Would introducing multithreading decrease the time the program takes to complete (I am working on a multicore machine)? If so, how many threads should I use?
Thank you!
It depends.
If none of the functions depend on each other at all, you can of course run them on separate threads (or even processes using multiprocessing, to avoid the global interpreter lock). You can either run one process per core, or run 100 processes, or any number in between, depending on the resource constraints of your system. (If you don't own the system, some admins don't like users who spam the process table.)
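A minimal sketch of the one-process-per-core option, assuming the calls really are independent (function here just stands in for the real ~30-minute computation):

import multiprocessing

def function(i):
    return i * i  # stand-in for the real long-running computation

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:             # defaults to one worker per core
        results = pool.map(function, range(1, 101))  # function(1) ... function(100)
    print(results[:5])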
If the functions must be run one after the other, then you can't do that. You have to restructure the program to try and isolate independent tasks, or accept that you might have a P-complete (inherently hard to parallelize) problem and move on.
I am developing some Python code for Windows. One criterion is that it will use less than 1% of CPU. I understand that it is impossible to guarantee this all the time due to things like garbage collection, but what would be the best practice to get as close as possible? My current solution is to spread a lot of time.sleep(0.1) calls around the code, especially in loops. There are, however, obvious problems with this approach.
What other approaches could be taken?
I should also mention that the application has lots of threads in it using the threading library.
EDIT: Setting the process priority is not what I am after.
It is the job of the operating system to schedule CPU time. Use your operating system's built-in process-limits mechanisms (hopefully they exist on Windows) to restrict your process to <1% CPU.
This style of sprinkling unnecessary sleeps every few lines will make the code terrible to create, extend and maintain, not to mention incredibly inelegant. (Rate-limiting yourself may be useful in very small, limited, critical sections -- for example, if your program is queuing lots of IO requests and you don't wish to inundate the operating system, you might put a single sleep-until-[condition] in each critical loop which has the potential to inundate the system -- but otherwise use it extremely sparingly.)
Ideally you would call an API to the appropriate OS mechanisms from within your program when you start up, telling the OS to throttle you appropriately.
If the goal is to not bother the user then "below 1% CPU" is the wrong approach. What you really want is "don't take time away from other processes but still complete as fast as possible" - that's what "below normal" process priority is for. See http://code.activestate.com/recipes/496767-set-process-priority-in-windows/ for an example of how process priority can be changed for the current process (calling that function with default parameters will do).
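For completeness, the same thing can be done with the third-party psutil package instead of the win32 calls in that recipe (an assumption: psutil is installed and the code is running on Windows):

import psutil

# Ask Windows to schedule this process below normal priority, so it only
# gets CPU time that foreground applications are not using.
psutil.Process().nice(psutil.BELOW_NORMAL_PRIORITY_CLASS)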
For the sales pitch you can show the task manager while the computer is idle ("See? 99%, my application gets lots of work done") and then start some CPU-intensive application ("Almost all CPU time is spent in the application the user is working with, my application simply went into background").
If the box used for the demonstration is a Windows Server, it can use Windows System Resource Manager for restricting CPU usage below the desired threshold. Trying to force this behavior by code is impossible, unless a Windows API exposes this capability explicitly.