Task is:
I have task queue stored in db. It grows. I need to solve tasks by python script when I have resources for it. I see two ways:
python script working all the time. But i don't like it (reason posible memory leak).
python script called by cron and do a little part of task. But i need to solve the problem of one working active script in memory (To prevent active scripts count grow). What is the best solution to implement it in python?
Any ideas to solve this problem at all?
You can use a lockfile to prevent multiple scripts from running out of cron. See the answers to an earlier question, "Python: module for creating PID-based lockfile". This is really just good practice in general for anything that you need to make sure won't have multiple instances running, actually, so you should look into it even if you do have the script running constantly, which I do suggest.
For most things, it shouldn't be too hard to avoid memory leaks, but if you're having a lot of trouble with it (I sometimes do with complex third-party web frameworks, for example), I would suggest instead writing the script with a small, carefully-designed main loop that monitors the database for new jobs, and then uses the multiprocessing module to fork off new processes to complete each task.
When a task is complete, the child process can exit, immediately freeing any memory that isn't properly garbage collected, and the main loop should be simple enough that you can avoid any memory leaks.
This also offers the advantage that you can run multiple tasks in parallel if your system has more than one CPU core, or if your tasks spend a lot of time waiting for I/O.
This is a bit of a vague question. One thing you should remember is that it is very difficult to leak memory in Python, because of the automatic garbage collection. croning a Python script to handle the queue isn't very nice, although it would work fine.
I would use method 1; if you need more power you could make a small Python process that monitors the DB queue and starts new processes to handle the tasks.
I'd suggest using Celery, an asynchronous task queuing system which I use myself.
It may seem a bit heavy for your use case, but it makes it easy to expand later by adding more worker resources if/when needed.
Related
I have a rather unique problem at the moment: Several python processes on a single windows machine are spawned and are assigned jobs, which can take up to a few minutes to complete.
The job assigned to each process is unique - usually described through a uuid.
At any time, a user can issue a kill signal for a specific job - which then should be stopped, mostly to conserve a scarce resource which each job consumes.
I expect the kill signal to be quite rare, but each job will frequently ask "can I proceed with this id?".
Another solution of ours uses a mqsql table with ids to solve this problem, but this is quite slow and I would like to find a faster solution?
I know mmaps and could probably use them, but I have to take a lock while writing a stale id - does anybody know a locking mechanism I can use?
Most ideally I would like to avoid rolling my own solution, so is there some prexisting package? (redis etc. are out, mostly due to configuration issues on the machines).
My wx GUI shows thumbnails, but they're slow to generate, so:
The program should remain usable while the thumbnails are generating.
Switching to a new folder should stop generating thumbnails for the old folder.
If possible, thumbnail generation should make use of multiple processors.
What is the best way to do this?
Putting the thumbnail generation in a background thread with threading.Thread will solve your first problem, making the program usable.
If you want a way to interrupt it, the usual way is to add a "stop" variable which the background thread checks every so often (e.g., once per thumbnail), and the GUI thread sets when it wants to stop it. Ideally you should protect this with a threading.Condition. (The condition isn't actually necessary in most cases—the same GIL that prevents your code from parallelizing well also protects you from certain kinds of race conditions. But you shouldn't rely on that.)
For the third problem, the first question is: Is thumbnail generation actually CPU-bound? If you're spending more time reading and writing images from disk, it probably isn't, so there's no point trying to parallelize it. But, let's assume that it is.
First, if you have N cores, you want a pool of N threads, or N-1 if the main thread has a lot of work to do too, or maybe something like 2N or 2N-1 to trade off a bit of best-case performance for a bit of worst-case performance.
However, if that CPU work is done in Python, or in a C extension that nevertheless holds the Python GIL, this won't help, because most of the time, only one of those threads will actually be running.
One solution to this is to switch from threads to processes, ideally using the standard multiprocessing module. It has built-in APIs to create a pool of processes, and to submit jobs to the pool with simple load-balancing.
The problem with using processes is that you no longer get automatic sharing of data, so that "stop flag" won't work. You need to explicitly create a flag in shared memory, or use a pipe or some other mechanism for communication instead. The multiprocessing docs explain the various ways to do this.
You can actually just kill the subprocesses. However, you may not want to do this. First, unless you've written your code carefully, it may leave your thumbnail cache in an inconsistent state that will confuse the rest of your code. Also, if you want this to be efficient on Windows, creating the subprocesses takes some time (not as in "30 minutes" or anything, but enough to affect the perceived responsiveness of your code if you recreate the pool every time a user clicks a new folder), so you probably want to create the pool before you need it, and keep it for the entire life of the program.
Other than that, all you have to get right is the job size. Hopefully creating one thumbnail isn't too big of a job—but if it's too small of a job, you can batch multiple thumbnails up into a single job—or, more simply, look at the multiprocessing API and change the way it batches jobs when load-balancing.
Meanwhile, if you go with a pool solution (whether threads or processes), if your jobs are small enough, you may not really need to cancel. Just drain the job queue—each worker will finish whichever job it's working on now, but then sleep until you feed in more jobs. Remember to also drain the queue (and then maybe join the pool) when it's time to quit.
One last thing to keep in mind is that if you successfully generate thumbnails as fast as your computer is capable of generating them, you may actually cause the whole computer—and therefore your GUI—to become sluggish and unresponsive. This usually comes up when your code is actually I/O bound and you're using most of the disk bandwidth, or when you use lots of memory and trigger swap thrash, but if your code really is CPU-bound, and you're having problems because you're using all the CPU, you may want to either use 1 fewer core, or look into setting thread/process priorities.
What is the best way to continuously repeat the execution of a given function at a fixed interval while being able to terminate the executor (thread or process) immediately?
Basically I know two approaches:
use multiprocessing and function with infinite cycle and time.sleep at the end. Processing is terminated with process.terminate() in any state.
use threading and constantly recreate timers at the end of the thread function. Processing is terminated by timer.cancel() while sleeping.
(both “in any state” and “while sleeping” are fine, even though the latter may be not immediate). The problem is that I have to use both multiprocessing and threading as the latter appears not to work on ARM (some fuzzy interaction of python interpreter and vim, outside of vim everything is fine) (I was using the second approach there, have not tried threading+cycle; no code is currently left) and the former spawns way too many processes which I would like not to see unless really required. This leads to a problem of having to code two different approaches while threading with cycle is just a few more imports for drop-in replacements of all multiprocessing stuff wrapped in if/else (except that there is no thread.terminate()). Is there some better way to do the job?
Currently used code is here (currently with cycle for both jobs), but I do not think it will be much useful to answer the question.
Update: The reason why I am using this solution are functions that display file status (and some other things like branch) in version control systems in vim statusline. These statuses must be updated, but updating them immediately cannot be done without using hooks and I have no idea how to set hooks temporary and remove on vim quit without possibly spoiling user configuration. Thus standard solution is cache expiring after N seconds. But when cache expired I need to do an expensive shell call and the delay appears to be noticeable, the more noticeable the heavier IO load is. What I am implementing now is updating values for viewed buffers each N seconds in a separate process thus delays are bothering that process and not me. Threads are likely to also work because GIL does not affect calls to external programs.
I'm not clear on why a single long-lived thread that loops infinitely over the tasks wouldn't work for you? Or why you end up with many processes in the multiprocess option?
My immediate reaction would have been a single thread with a queue to feed it things to do. But I may be misunderstanding the problem.
I do not know how do it simply and/or cleanly in Python, but I was wondering if maybe you couldn't take avantage of an existing system scheduler, e.g. crontab for *nix system.
There is an API in python and it might satisfied your needs.
General tutorial or good resource on how to use threads in Python?
When to use threads, how they are effective, and some general background on threads [specific to Python]?
Threads should be used when you want two things to run at once, or want something to run in the background without slowing down the main process.
My recommendation is to only use threads if you have to. They generally add complexity to a program.
The main documentation for threading is here: http://docs.python.org/library/threading.html
Some examples are here:
http://www.devshed.com/c/a/Python/Basic-Threading-in-Python/1/
http://linuxgazette.net/107/pai.html
http://www.wellho.net/solutions/python-python-threads-a-first-example.html
One thing to remember before spending time and effort in writing a multi-threaded Python application is that there is a Global Interpreter Lock (GIL), so you won't actually be running more than one thread at a time.
This makes threading unsuitable for trying to take advantage of multiple cores or CPUs. You may get some speedup from multiplexing other resources (network, disk, ...), but it's never been particularly noticeable in my experience.
In general, I only use threads when there are several logically separate tasks happening at once, and yet I want them all in the same VM. A thread pulling data from the web and putting it on a Queue, while another thread pops from the Queue and writes to a database, something like that.
With Python 2.6, there is the new multiprocessing module which is pretty cool - it's got a very similar interface to the threading module, but actually spawns new OS processes, sidestepping the GIL.
There is a fantastic pdf, Tutorial on Threads Programming with Python by
Norman Matloff and Francis Hsu, of University of California, Davis.
Threads should be avoided whenever possible. They add much in complexity, synchronization issues and hard to debug issues. However some problems require them (i.e. GUI programming), but I would encourage you to look for a single-threaded solution if you can.
There are several tutorials here.
Keeping the GUI responsive while the application does some CPU-heavy processing is one of the challenges of effective GUI programming.
Here's a good discussion of how to do this in wxPython. To summarize, there are 3 ways:
Use threads
Use wxYield
Chunk the work and do it in the IDLE event handler
Which method have you found to be the most effective ? Techniques from other frameworks (like Qt, GTK or Windows API) are also welcome.
Threads. They're what I always go for because you can do it in every framework you need.
And once you're used to multi-threading and parallel processing in one language/framework, you're good on all frameworks.
Definitely threads. Why? The future is multi-core. Almost any new CPU has more than one core or if it has just one, it might support hyperthreading and thus pretending it has more than one. To effectively make use of multi-core CPUs (and Intel is planing to go up to 32 cores in the not so far future), you need multiple threads. If you run all in one main thread (usually the UI thread is the main thread), users will have CPUs with 8, 16 and one day 32 cores and your application never uses more than one of these, IOW it runs much, much slower than it could run.
Actual if you plan an application nowadays, I would go away of the classical design and think of a master/slave relationship. Your UI is the master, it's only task is to interact with the user. That is displaying data to the user and gathering user input. Whenever you app needs to "process any data" (even small amounts and much more important big ones), create a "task" of any kind, forward this task to a background thread and make the thread perform the task, providing feedback to the UI (e.g. how many percent it has completed or just if the task is still running or not, so the UI can show a "work-in-progress indicator"). If possible, split the task into many small, independent sub-tasks and run more than one background process, feeding one sub-task to each of them. That way your application can really benefit from multi-core and get faster the more cores CPUs have.
Actually companies like Apple and Microsoft are already planing on how to make their still most single threaded UIs themselves multithreaded. Even with the approach above, you may one day have the situation that the UI is the bottleneck itself. The background processes can process data much faster than the UI can present it to the user or ask the user for input. Today many UI frameworks are little thread-safe, many not thread-safe at all, but that will change. Serial processing (doing one task after another) is a dying design, parallel processing (doing many task at once) is where the future goes. Just look at graphic adapters. Even the most modern NVidia card has a pitiful performance, if you look at the processing speed in MHz/GHz of the GPU alone. How comes it can beat the crap out of CPUs when it comes to 3D calculations? Simple: Instead of calculating one polygon point or one texture pixel after another, it calculates many of them in parallel (actually a whole bunch at the same time) and that way it reaches a throughput that still makes CPUs cry. E.g. the ATI X1900 (to name the competitor as well) has 48 shader units!
I think delayedresult is what you are looking for:
http://www.wxpython.org/docs/api/wx.lib.delayedresult-module.html
See the wxpython demo for an example.
Threads or processes depending on the application. Sometimes it's actually best to have the GUI be it's own program and just send asynchronous calls to other programs when it has work to do. You'll still end up having multiple threads in the GUI to monitor for results, but it can simplify things if the work being done is complex and not directly connected to the GUI.
Threads -
Let's use a simple 2-layer view (GUI, application logic).
The application logic work should be done in a separate Python thread. For Asynchronous events that need to propagate up to the GUI layer, use wx's event system to post custom events. Posting wx events is thread safe so you could conceivably do it from multiple contexts.
Working in the other direction (GUI input events triggering application logic), I have found it best to home-roll a custom event system. Use the Queue module to have a thread-safe way of pushing and popping event objects. Then, for every synchronous member function, pair it with an async version that pushes the sync function object and the parameters onto the event queue.
This works particularly well if only a single application logic-level operation can be performed at a time. The benefit of this model is that synchronization is simple - each synchronous function works within it's own context sequentially from start to end without worry of pre-emption or hand-coded yielding. You will not need locks to protect your critical sections. At the end of the function, post an event to the GUI layer indicating that the operation is complete.
You could scale this to allow multiple application-level threads to exist, but the usual concerns with synchronization will re-appear.
edit - Forgot to mention the beauty of this is that it is possible to completely decouple the application logic from the GUI code. The modularity helps if you ever decide to use a different framework or use provide a command-line version of the app. To do this, you will need an intermediate event dispatcher (application level -> GUI) that is implemented by the GUI layer.
Working with Qt/C++ for Win32.
We divide the major work units into different processes. The GUI runs as a separate process and is able to command/receive data from the "worker" processes as needed. Works nicely in todays multi-core world.
This answer doesn't apply to the OP's question regarding Python, but is more of a meta-response.
The easy way is threads. However, not every platform has pre-emptive threading (e.g. BREW, some other embedded systems) If possibly, simply chunk the work and do it in the IDLE event handler.
Another problem with using threads in BREW is that it doesn't clean up C++ stack objects, so it's way too easy to leak memory if you simply kill the thread.
I use threads so the GUI's main event loop never blocks.
For some types of operations, using separate processes makes a lot of sense. Back in the day, spawning a process incurred a lot of overhead. With modern hardware this overhead is hardly even a blip on the screen. This is especially true if you're spawning a long running process.
One (arguable) advantage is that it's a simpler conceptual model than threads that might lead to more maintainable code. It can also make your code easier to test, as you can write test scripts that exercise these external processes without having to involve the GUI. Some might even argue that is the primary advantage.
In the case of some code I once worked on, switching from threads to separate processes led to a net reduction of over 5000 lines of code while at the same time making the GUI more responsive, the code easier to maintain and test, all while improving the total overall performance.