Python: Interruptable threading in wx - python

My wx GUI shows thumbnails, but they're slow to generate, so:
The program should remain usable while the thumbnails are generating.
Switching to a new folder should stop generating thumbnails for the old folder.
If possible, thumbnail generation should make use of multiple processors.
What is the best way to do this?

Putting the thumbnail generation in a background thread with threading.Thread will solve your first problem, making the program usable.
If you want a way to interrupt it, the usual way is to add a "stop" flag which the background thread checks every so often (e.g., once per thumbnail), and which the GUI thread sets when it wants to stop it. Ideally you should protect this flag with a synchronization primitive such as a threading.Condition or, more simply, use a threading.Event as the flag itself. (The synchronization isn't actually necessary in most cases: the same GIL that prevents your code from parallelizing well also protects you from certain kinds of race conditions. But you shouldn't rely on that.)
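A minimal sketch of the stop-flag idea, using a threading.Event as the flag (an Event is a thread-safe boolean from the standard library). The names generate_thumbnails, start_worker and stop_worker, and the sleep standing in for the real thumbnail work, are assumptions made for illustration:

```python
import threading
import time


def generate_thumbnails(paths, stop_event, results):
    """Worker: handle one thumbnail at a time, checking the stop flag in between."""
    for path in paths:
        if stop_event.is_set():
            return  # the GUI thread asked us to stop
        time.sleep(0.01)  # placeholder for the real (slow) thumbnail work
        results.append("thumb:" + path)


def start_worker(paths):
    """Start a background worker; returns (thread, stop_event, results)."""
    stop_event = threading.Event()
    results = []
    thread = threading.Thread(
        target=generate_thumbnails,
        args=(paths, stop_event, results),
        daemon=True,
    )
    thread.start()
    return thread, stop_event, results


def stop_worker(thread, stop_event):
    """Set the flag, then wait for the worker to notice it and exit."""
    stop_event.set()
    thread.join()
```

When the user switches folders, the GUI would call stop_worker for the old folder and start_worker for the new one. In a real wx program the worker would probably hand each finished thumbnail back to the GUI thread with wx.CallAfter rather than appending to a list.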
For the third problem, the first question is: Is thumbnail generation actually CPU-bound? If you're spending more time reading and writing images on disk than computing, it probably isn't, so there's no point trying to parallelize it. But let's assume that it is.
First, if you have N cores, you want a pool of N threads, or N-1 if the main thread has a lot of work to do too, or maybe something like 2N or 2N-1 to trade off a bit of best-case performance for a bit of worst-case performance.
However, if that CPU work is done in Python, or in a C extension that nevertheless holds the Python GIL, this won't help, because most of the time, only one of those threads will actually be running.
One solution to this is to switch from threads to processes, ideally using the standard multiprocessing module. It has built-in APIs to create a pool of processes, and to submit jobs to the pool with simple load-balancing.
The problem with using processes is that you no longer get automatic sharing of data, so that "stop flag" won't work. You need to explicitly create a flag in shared memory, or use a pipe or some other mechanism for communication instead. The multiprocessing docs explain the various ways to do this.
You can actually just kill the subprocesses. However, you may not want to do this. First, unless you've written your code carefully, it may leave your thumbnail cache in an inconsistent state that will confuse the rest of your code. Also, if you want this to be efficient on Windows, creating the subprocesses takes some time (not as in "30 minutes" or anything, but enough to affect the perceived responsiveness of your code if you recreate the pool every time a user clicks a new folder), so you probably want to create the pool before you need it, and keep it for the entire life of the program.
Other than that, all you have to get right is the job size. Hopefully creating one thumbnail isn't too big of a job—but if it's too small of a job, you can batch multiple thumbnails up into a single job—or, more simply, look at the multiprocessing API and change the way it batches jobs when load-balancing.
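As a rough sketch of the pool approach, assuming a placeholder make_thumbnail function (the checksum stands in for real image work) and a hypothetical helper make_all: multiprocessing.Pool.imap_unordered takes a chunksize argument that controls how many items are batched into each job.

```python
import multiprocessing


def make_thumbnail(path):
    """Placeholder for CPU-heavy work; returns a fake thumbnail name."""
    checksum = sum(ord(ch) for ch in path) % 1000  # stand-in computation
    return "%s.thumb-%03d" % (path, checksum)


def make_all(pool, paths, chunksize=4):
    """Return an iterator of results, in completion order.

    chunksize is the number of paths batched into each job sent to a worker.
    """
    return pool.imap_unordered(make_thumbnail, paths, chunksize=chunksize)


if __name__ == "__main__":
    # Leave one core free for the GUI (an assumption; tune for your program).
    workers = max(1, multiprocessing.cpu_count() - 1)
    with multiprocessing.Pool(processes=workers) as pool:
        for thumb in make_all(pool, ["a.jpg", "b.jpg", "c.jpg"]):
            print(thumb)
```

In a long-running GUI you would create the pool once at startup and keep it, rather than using a with block around each batch, for the startup-cost reason given above.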
Meanwhile, if you go with a pool solution (whether threads or processes), if your jobs are small enough, you may not really need to cancel. Just drain the job queue—each worker will finish whichever job it's working on now, but then sleep until you feed in more jobs. Remember to also drain the queue (and then maybe join the pool) when it's time to quit.
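Draining could look something like this with a hand-rolled thread pool built on queue.Queue. The drain and worker functions and the None sentinel are illustrative choices, not part of any library API:

```python
import queue
import threading


def drain(jobs):
    """Discard every pending job; return how many were dropped."""
    dropped = 0
    while True:
        try:
            jobs.get_nowait()
        except queue.Empty:
            return dropped
        jobs.task_done()
        dropped += 1


def worker(jobs, done):
    """Process jobs until a None sentinel arrives."""
    while True:
        path = jobs.get()  # sleeps here while the queue is empty
        try:
            if path is None:
                return
            done.append("thumb:" + path)  # placeholder for real work
        finally:
            jobs.task_done()
```

On a folder switch the GUI would call drain(jobs) and then enqueue the new folder's paths. On quit it would drain, put one None per worker, and join the threads.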
One last thing to keep in mind is that if you successfully generate thumbnails as fast as your computer is capable of generating them, you may actually cause the whole computer—and therefore your GUI—to become sluggish and unresponsive. This usually comes up when your code is actually I/O bound and you're using most of the disk bandwidth, or when you use lots of memory and trigger swap thrash, but if your code really is CPU-bound, and you're having problems because you're using all the CPU, you may want to either use 1 fewer core, or look into setting thread/process priorities.

Related

Threads are not happening at the same time?

I have a program that fetches data via an API. I created a function that only takes the target data as an argument and with a for-loop I run this method 10 times.
The program takes quite some time to display the data because the next function call only happens when the previous call has done its work.
I want to use Threads to make it all happen quicker. However, I'm confused. On realpython.org I read this:
A thread is a separate flow of execution. This means that your program will have two things happening at once. But for most Python 3 implementations the different threads do not actually execute at the same time: they merely appear to. It’s tempting to think of threading as having two (or more) different processors running on your program, each one doing an independent task at the same time. That’s almost right. The threads may be running on different processors, but they will only be running one at a time.
First they say: "This means that your program will have two things happening at once" and then they say "but they will only be running one at a time". So my threads are not done simultaneously?
I want to make a decision on whether to use Threads or Multiprocessing but I can't figure it out.
Can somebody help?
With either Threads or Multiprocessing you must assume that execution of your program could jump from one thread/process to another at any moment. The difference is that with Threads, code is never really executed at the same time. That means there is always only one CPU core doing your work. With Multiprocessing, your code runs on multiple cores at the same time. So only Multiprocessing can solve your computation up to N times faster with N processes. (There will be some overhead, of course.) If you are not doing any heavy computation but need to create the illusion of things running in parallel, use threads. This is especially useful for GUIs.
The confusing part is that IO (copying files or loading something from the web, for example) is not CPU bound, as it does not require a lot of CPU instructions to happen. So always use threads for this. To understand it a bit more, you should realise that when a thread is waiting for an IO operation to finish, it is actually in a blocked state. This allows other threads to run. So if you use threads to fetch data, the first thread will start loading it and then block. This makes room for the second thread to do the same, and so on. When one of the threads has the data ready, it will unblock, run the rest of its code and finish.
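To make this concrete, here is a small sketch in which fetch is a hypothetical stand-in for the blocking API call (time.sleep blocks the thread much as real network IO would). Ten calls run through a thread pool finish in roughly the time of one, instead of ten times that:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(target):
    """Stand-in for a blocking API call; the thread is blocked while it sleeps."""
    time.sleep(0.2)
    return "data for " + target


def fetch_all(targets):
    """Run fetch for every target in its own thread; results keep input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        return list(pool.map(fetch, targets))
```

Calling fetch_all on ten targets takes about as long as one call to fetch, because all ten threads spend almost all of their time blocked.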
(Note that when multiple threads are running they can pause randomly and give room for other threads to run for a while and then carry on. (See first sentence of this answer.))
Generally always use threads unless you need to do something CPU heavy in parallel. Multiprocessing has a lot of limitations when it comes to how it works internally and using it is more complicated and heavy.
This only applies to some implementations of Python though, for example the most commonly used "official" implementation, CPython. In other languages or less common Python implementations, threads are often able to execute instructions on multiple cores at the same time.

Python: Continuously and cancelably repeat execution with fixed interval

What is the best way to continuously repeat the execution of a given function at a fixed interval while being able to terminate the executor (thread or process) immediately?
Basically I know two approaches:
Use multiprocessing and a function with an infinite loop and time.sleep at the end. Processing is terminated with process.terminate() in any state.
Use threading and constantly recreate timers at the end of the thread function. Processing is terminated by timer.cancel() while sleeping.
(Both “in any state” and “while sleeping” are fine, even though the latter may not be immediate.) The problem is that I have to use both multiprocessing and threading: the latter appears not to work on ARM (some fuzzy interaction between the Python interpreter and vim; outside of vim everything is fine), while the former spawns far too many processes, which I would rather not see unless really required. (On ARM I was using the second approach; I have not tried threading with a loop, and no code is currently left.) This means coding two different approaches, whereas threading with a loop needs just a few more imports as drop-in replacements for all the multiprocessing stuff, wrapped in if/else (except that there is no thread.terminate()). Is there some better way to do the job?
Currently used code is here (currently with a loop for both jobs), but I do not think it will be of much use in answering the question.
Update: The reason I am using this solution is a set of functions that display file status (and some other things, like the branch) from version control systems in the vim statusline. These statuses must be updated, but updating them immediately cannot be done without using hooks, and I have no idea how to set hooks temporarily and remove them on vim quit without possibly spoiling the user's configuration. Thus the standard solution is a cache that expires after N seconds. But when the cache expires I need to do an expensive shell call, and the delay is noticeable, more so the heavier the IO load is. What I am implementing now is updating the values for viewed buffers every N seconds in a separate process, so that the delays bother that process and not me. Threads are likely to work as well, because the GIL does not affect calls to external programs.
I'm not clear on why a single long-lived thread that loops infinitely over the tasks wouldn't work for you? Or why you end up with many processes in the multiprocess option?
My immediate reaction would have been a single thread with a queue to feed it things to do. But I may be misunderstanding the problem.
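One way the single long-lived thread could be sketched is with threading.Event.wait(timeout) used as an interruptible sleep. The Repeater class below is a hypothetical helper, not an existing API:

```python
import threading


class Repeater:
    """Call func every `interval` seconds on one thread until stop() is called."""

    def __init__(self, interval, func):
        self._interval = interval
        self._func = func
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # wait() returns True as soon as stop() sets the event, so the
        # sleep is cut short instead of running to completion.
        while not self._stop.wait(self._interval):
            self._func()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

Note the limits of this sketch: stop() interrupts the sleep immediately, but a call to func that is already running is allowed to finish, and the interval is measured from the end of one call to the start of the next.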
I do not know how to do it simply and/or cleanly in Python, but I was wondering whether you could take advantage of an existing system scheduler, e.g. crontab on *nix systems.
There is a Python API for it, and it might satisfy your needs.

How does threading or Multiprocessing work with recursions?

Background
I'm a bit new to developing and had a general Python/programming question. If you have a method that is recursive, what is involved in enabling multiple threads or multiprocessing? I've done some light reading and tried a few examples, but they seem to apply the syntax to new code (and not very CPU-intensive tasks). I'm wondering more how I would redesign existing code to do this.
Say I have something that's CPU intensive (basically it keeps adding to itself until a limit is hit):
def adderExample(total, number):
    # Recurse until the running total passes the limit.
    if total > 1000:
        print('total is larger than 1000. Stopping')
    else:
        total = total + number
        print(total)
        number = number + 1
        adderExample(total, number)

adderExample(0, 0)
Question(s)/Thought process
How would I approach this to make it run faster, assuming I have multiple cores available? (I eventually want it to span machines, but I think that's a separate issue involving Hadoop, so I'll keep this example to one system with multiple CPUs.) It seems threading isn't the best choice (because of the time it takes to spawn new threads); if that's true, should I focus only on multiprocessing? If so, can recursions be split across different CPUs (via queues, I assume, and then rejoined after they're done)? Can I create multiple threads for each process and then split those processes over multiple CPUs? Lastly, is the recursion depth limit an overall limit, or is it per thread/process, and if so, does multiprocessing/threading get around it?
Another (related) question: how do those guys trying to crack codes (RSA, wireless keys, etc.) via brute force overcome this problem? I assume they are scaling their mathematical processes over multiple CPUs somehow. This or any example to build my understanding would be great.
Any tips/suggestions would be great
Thanks!
Such a loop wouldn't benefit much at all from threading. Consider that you're doing a series of additions, whose intermediate values depend on the previous iterations. This can't be parallelized, because the threads would be stomping on each other's values and overwriting things. You can lock the data so only one thread works on it at a time, but then you lose any benefit of having multiple threads working on that data.
Threads work best when they have independent data sets. A graphics renderer is a perfect example. Each thread renders a subset of the larger image - they may share common data sources for texture/vertex/color/etc... data, but each thread has its own little section of the total image to work on, and doesn't touch other areas of the image. Whatever thread #1 does on its little section of pixels won't affect what thread #2 is doing elsewhere in the image.
For your related question, password cracking is another example where threading/multiprocessing makes sense. Each thread goes off on its own testing multiple possible passwords against one common "to be cracked" list. What one thread is doing doesn't affect any of the other cracker threads, unless you get a match, which may mean all threads abort since the job is "done".
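A toy illustration of that independence, under stated assumptions: the "password" is three lowercase letters, the stored value is its SHA-256 hex digest, and the search space is split by first letter so that no chunk depends on any other. The function names and the executor_map parameter are made up for this sketch:

```python
import hashlib
import itertools
import string
from concurrent.futures import ProcessPoolExecutor


def check_prefix(args):
    """Try every 3-letter lowercase candidate that starts with `prefix`."""
    prefix, target_hash = args
    for tail in itertools.product(string.ascii_lowercase, repeat=2):
        candidate = prefix + "".join(tail)
        if hashlib.sha256(candidate.encode()).hexdigest() == target_hash:
            return candidate
    return None


def crack(target_hash, executor_map=map):
    """Split the search by first letter; each chunk is fully independent."""
    jobs = [(letter, target_hash) for letter in string.ascii_lowercase]
    for found in executor_map(check_prefix, jobs):
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    secret_hash = hashlib.sha256(b"dog").hexdigest()
    with ProcessPoolExecutor() as pool:
        print(crack(secret_hash, pool.map))
```

Passing pool.map spreads the 26 chunks over worker processes; the default built-in map runs the same chunks serially, which makes the two easy to compare. The recursive adder above has no such split, because each step needs the previous step's result.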
Once threads become interdependent, you lose a lot of the benefits of having multiple threads. They'll spend more time waiting for the other to finish than they'll spend on doing actual work. Of course, this doesn't say you should never use threads. Sometimes it does make sense to have multiple threads, even if they are interdependent. E.g. a graphics thread + sound effects thread + action processor thread + A.I. calculations thread, etc... in a game. Each one is nominally dependent on the others, but while the sound thread is busy generating the bang+ricochet audio for the gun the player just shot, the A.I. thread is off calculating what the game's mobs are doing, the graphics thread is drawing some clouds in the background, etc...
Threading kinda sorta implies multiple stacks, recursion single stacks. That said, if you get to the recurse-left, recurse-right part and decide to spawn threads for the sub-problems if the current count of threads is "low" and do straight recursion otherwise you can combine the concepts.
But regular Python is not a good language for this pattern. Python threads are real OS threads, but in CPython the Global Interpreter Lock lets only one of them execute Python bytecode at a time, so you won't actually pick up any multiprocessing goodness.
Phunctor is correct that the threading library is a poor choice for parallelizing this type of problem, due to the "Global Interpreter Lock" that prevents multiple threads from executing Python code in parallel.
Where the threading library can be highly useful, though, is when each thread's code spends a lot of time waiting for I/O to happen. So, for example, if you're implementing a server that has to hit the disk or wait on a network response, servicing a request in each thread can be very efficient, since the threading library can favor the ones that are not waiting on I/O and thus maximize use of the Python interpreter. (In a single thread, you'd have to use a tight loop checking the statuses of your I/O requests, which would tend to be wasteful as load got high.)

python program choice

My program is an ICAPServer (similar to an httpserver); its main job is to receive data from clients and save the data to a DB.
There are two main steps and two threads:
ICAPServer receives data from clients and puts the data in a queue (50kb <1ms);
another thread pops data from the queue and writes it to the DB.
So, if the 2nd step is too slow, the queue will fill up memory with that data.
Wondering if anyone has any suggestions...
It is hard to say for sure, but perhaps using two processes instead of threads will help in this situation. Python's Global Interpreter Lock (GIL) allows only one thread to execute Python instructions at any time.
Having a system designed around processes might have the following advantages:
Higher concurrency, especially on multiprocessor machines
Greater throughput, since you can probably spawn multiple queue consumers / DB writer processes to spread out the work. Although, the impact of this might be minimal if it is really the DB that is the bottleneck and not the process writing to the DB.
One note: before going for optimizations, it is very important to get some good measurements and do some profiling.
That said, I would bet the slow part of the second step is database communication; you could try to analyze the SQL statement and its execution plan, and then optimize it (this is one of the features of SQLAlchemy). If it is still too slow, look into database optimizations.
Of course, it is possible that the bottleneck is in a completely different place; in that case, you can still optimize using C code, a dedicated network, or more threads - just to give three examples of completely different kinds of optimization.
Another point: as I/O operations usually release the GIL, you could also try to improve performance just by adding another reader thread - and I think this could be a much cheaper solution.
Put an upper limit on the amount of data in the queue?
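A bounded queue.Queue gives exactly that limit: once maxsize items are waiting, put() blocks the receiving thread until the writer catches up, so memory use stays capped. In this sketch the list sink stands in for the database, and the None sentinel and function names are illustrative assumptions:

```python
import queue
import threading


def producer(q, items):
    """Receiver side: put() blocks while the queue is full (back-pressure)."""
    for item in items:
        q.put(item)
    q.put(None)  # sentinel: no more data


def consumer(q, sink):
    """Writer side: pop items until the sentinel arrives."""
    while True:
        item = q.get()
        if item is None:
            return
        sink.append(item)  # placeholder for the (slow) database write


def run(items, maxsize=8):
    """Wire one producer and one consumer through a bounded queue."""
    q = queue.Queue(maxsize=maxsize)
    sink = []
    threads = [
        threading.Thread(target=producer, args=(q, items)),
        threading.Thread(target=consumer, args=(q, sink)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sink
```

Whether blocking the receiver is acceptable depends on the clients. The alternative is put_nowait(), which raises queue.Full so that the server can reject or drop data instead of waiting.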

Keeping GUIs responsive during long-running tasks

Keeping the GUI responsive while the application does some CPU-heavy processing is one of the challenges of effective GUI programming.
Here's a good discussion of how to do this in wxPython. To summarize, there are 3 ways:
Use threads
Use wxYield
Chunk the work and do it in the IDLE event handler
Which method have you found to be the most effective? Techniques from other frameworks (like Qt, GTK or Windows API) are also welcome.
Threads. They're what I always go for because you can do it in every framework you need.
And once you're used to multi-threading and parallel processing in one language/framework, you're good on all frameworks.
Definitely threads. Why? The future is multi-core. Almost any new CPU has more than one core, or if it has just one, it might support hyperthreading and thus pretend to have more than one. To make effective use of multi-core CPUs (and Intel is planning to go up to 32 cores in the not so far future), you need multiple threads. If you run everything in one main thread (usually the UI thread is the main thread), users will have CPUs with 8, 16 and one day 32 cores and your application will never use more than one of them; in other words, it runs much, much slower than it could.
Actually, if you plan an application nowadays, I would move away from the classical design and think of a master/slave relationship. Your UI is the master; its only task is to interact with the user, that is, displaying data to the user and gathering user input. Whenever your app needs to "process any data" (even small amounts, and more importantly big ones), create a "task" of some kind, forward this task to a background thread and make the thread perform the task, providing feedback to the UI (e.g. what percentage it has completed, or just whether the task is still running, so the UI can show a "work-in-progress indicator"). If possible, split the task into many small, independent sub-tasks and run more than one background process, feeding one sub-task to each of them. That way your application can really benefit from multi-core and get faster the more cores CPUs have.
Actually, companies like Apple and Microsoft are already planning how to make their still mostly single-threaded UIs themselves multithreaded. Even with the approach above, you may one day find that the UI itself is the bottleneck: the background processes can process data much faster than the UI can present it to the user or ask the user for input. Today many UI frameworks are barely thread-safe, and many are not thread-safe at all, but that will change. Serial processing (doing one task after another) is a dying design; parallel processing (doing many tasks at once) is where the future is going. Just look at graphics adapters. Even the most modern NVidia card has pitiful performance if you look at the processing speed in MHz/GHz of the GPU alone. How come it can beat the crap out of CPUs when it comes to 3D calculations? Simple: instead of calculating one polygon point or one texture pixel after another, it calculates many of them in parallel (actually a whole bunch at the same time), and that way it reaches a throughput that still makes CPUs cry. E.g. the ATI X1900 (to name the competitor as well) has 48 shader units!
I think delayedresult is what you are looking for:
http://www.wxpython.org/docs/api/wx.lib.delayedresult-module.html
See the wxpython demo for an example.
Threads or processes, depending on the application. Sometimes it's actually best to have the GUI be its own program and just send asynchronous calls to other programs when it has work to do. You'll still end up having multiple threads in the GUI to monitor for results, but it can simplify things if the work being done is complex and not directly connected to the GUI.
Threads -
Let's use a simple 2-layer view (GUI, application logic).
The application logic work should be done in a separate Python thread. For asynchronous events that need to propagate up to the GUI layer, use wx's event system to post custom events. Posting wx events is thread safe, so you could conceivably do it from multiple contexts.
Working in the other direction (GUI input events triggering application logic), I have found it best to home-roll a custom event system. Use the Queue module to have a thread-safe way of pushing and popping event objects. Then, for every synchronous member function, pair it with an async version that pushes the sync function object and the parameters onto the event queue.
This works particularly well if only a single application logic-level operation can be performed at a time. The benefit of this model is that synchronization is simple - each synchronous function works within its own context sequentially from start to end without worry of pre-emption or hand-coded yielding. You will not need locks to protect your critical sections. At the end of the function, post an event to the GUI layer indicating that the operation is complete.
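A framework-agnostic sketch of this pattern follows. The AppLogic class, its load method, and the notify callback are hypothetical; in a wx program notify would typically wrap wx.CallAfter or wx.PostEvent so that the GUI handler runs on the GUI thread.

```python
import queue
import threading


class AppLogic:
    """Runs application-level calls one at a time on a single worker thread."""

    def __init__(self, notify):
        self._notify = notify  # called with (name, result) when a call finishes
        self._commands = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while True:
            func, args = self._commands.get()
            if func is None:  # shutdown sentinel
                return
            # Calls run strictly one after another, so they need no locks.
            self._notify(func.__name__, func(*args))

    def load(self, path):
        """Synchronous version: does the work and returns the result."""
        return "contents of " + path

    def load_async(self, path):
        """Async twin: enqueue the sync function and its arguments, return at once."""
        self._commands.put((self.load, (path,)))

    def shutdown(self):
        self._commands.put((None, ()))
        self._thread.join()
```

The GUI only ever calls the _async methods, and results come back through notify in the order the calls were queued.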
You could scale this to allow multiple application-level threads to exist, but the usual concerns with synchronization will re-appear.
edit - Forgot to mention the beauty of this is that it is possible to completely decouple the application logic from the GUI code. The modularity helps if you ever decide to use a different framework or provide a command-line version of the app. To do this, you will need an intermediate event dispatcher (application level -> GUI) that is implemented by the GUI layer.
Working with Qt/C++ for Win32.
We divide the major work units into different processes. The GUI runs as a separate process and is able to command/receive data from the "worker" processes as needed. Works nicely in today's multi-core world.
This answer doesn't apply to the OP's question regarding Python, but is more of a meta-response.
The easy way is threads. However, not every platform has pre-emptive threading (e.g. BREW, some other embedded systems). If possible, simply chunk the work and do it in the IDLE event handler.
Another problem with using threads in BREW is that it doesn't clean up C++ stack objects, so it's way too easy to leak memory if you simply kill the thread.
I use threads so the GUI's main event loop never blocks.
For some types of operations, using separate processes makes a lot of sense. Back in the day, spawning a process incurred a lot of overhead. With modern hardware this overhead is hardly even a blip on the screen. This is especially true if you're spawning a long running process.
One (arguable) advantage is that it's a simpler conceptual model than threads that might lead to more maintainable code. It can also make your code easier to test, as you can write test scripts that exercise these external processes without having to involve the GUI. Some might even argue that is the primary advantage.
In the case of some code I once worked on, switching from threads to separate processes led to a net reduction of over 5000 lines of code while at the same time making the GUI more responsive, the code easier to maintain and test, all while improving the total overall performance.
