TideSDK and long processing loops in Python

Is there a way to do long processing loops in Python without freezing the GUI with TideSDK?
or I'll just have to use threads...
Thanks.

There's nothing really specific to TideSDK here—this is a general issue with any program built around an event loop, which means nearly all GUI apps and network servers, among other things.
There are three standard solutions:
Break the long task up into a bunch of small tasks, each of which schedules the next to get run.
Make the task call back to the event loop every so often.
Run the task in parallel.
For the first solution, most event-based frameworks have a method like doLater(func) or setTimeout(func, 0). If not, they have to at least have a way of posting a message to the event loop's queue, and you can pretty easily build a doLater around that. This kind of API can be horrible to use in C-like languages, and a bit obnoxious in JS just because of the bizarre this/scoping rules, but in Python and most other dynamic languages it's nearly painless.
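As a minimal sketch of this first solution in Python: do_later below is a stand-in for whatever "run this soon on the event loop" primitive your framework exposes (not a real TideSDK API), and handle is your per-item work.

def process_in_chunks(items, handle, do_later, chunk_size=100):
    # Process a small slice, then schedule the next slice instead of
    # looping, so the event loop gets control between chunks.
    def do_chunk(start):
        for item in items[start:start + chunk_size]:
            handle(item)
        if start + chunk_size < len(items):
            do_later(lambda: do_chunk(start + chunk_size))
    do_chunk(0)

The GUI stays responsive because each chunk is short; the "loop" lives in the event queue rather than on the stack.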
Since TideSDK is built around a browser JS engine, it's almost certainly going to provide this first solution.
The second solution really only makes sense for frameworks built around either cooperative threadlets or explicit coroutines. However, some traditional single-threaded frameworks like classic Mac (and, therefore, modern Win32 and a few cross-platform frameworks like wxWidgets) use this for running background jobs.
The first problem is that you have to deal with re-entrancy carefully (at least wx has a SafeYield to help a little), or you can end up with many of the same kinds of problems as threads—or, worse, everything seems to work except that under heavy use you occasionally get a stack crash from infinite recursion. The other problem is that it only really works well when there's only one heavy background task at a time; with more than one, they all end up yielding into each other, and scheduling them fairly is left entirely up to you.
If your framework has a way of doing this, it'll have a function like yieldToOtherTasks or processNextEvent, and all you have to do is make sure to call that every once in a while. (However, if there's also a doLater, you should consider that first.) If there is no such method, this solution is not appropriate to your framework.
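As a rough sketch of this second solution in wxPython, one of the frameworks that supports it: wx.SafeYield is the real wx call, while crunch is a hypothetical per-row worker.

import wx

def generate_report(rows, crunch):
    for i, row in enumerate(rows):
        crunch(row)            # the expensive per-row work
        if i % 50 == 0:
            wx.SafeYield()     # let pending GUI events run every 50 rows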
The third solution is to spin off a task via threading.Thread or multiprocessing.Process.
The problem with this parallelism is that you have to come up with some way to signal safely, and to share data safely. Some event-loop frameworks have a thread-safe "doLater" or "postEvent" method, and if the only signal you need is "task finished" and the only data you need to share are the task startup params and return values, everything is easy. But once that's not sufficient, things can get very complicated.
Also, if you have hundreds of long-running tasks to run, you probably don't want a thread or process for each one. In fact, you probably want a fixed-size pool of threads or processes, and then you'll have to break your tasks into small-enough subtasks so they don't starve each other out, so in effect you're doing all the work of solution #1 anyway.
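As a sketch of such a fixed-size pool with the standard concurrent.futures module; on_done is a hypothetical callback, and note that it fires on the worker thread, so marshal back to the GUI with your framework's thread-safe postEvent/doLater equivalent.

from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)   # create once, reuse for the app's lifetime

def submit_task(task, on_done):
    # Queue the task; at most max_workers run at once, so many
    # long-running tasks can't starve each other out.
    future = pool.submit(task)
    future.add_done_callback(lambda f: on_done(f.result()))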
However, there are cases where threads or processes are the simplest solution.

Related

asyncio performance in non-web, non-files apps

I’ve got a burning question. Recently I’ve been learning asyncio in Python and found it very useful and efficient, but here is my question: is it efficient to use it for “normal” things?
It’s obvious that using asynchronous operations for making requests, handling requests (in APIs), or working on files will give us a performance gain. But what about other operations? For example, if I want to do a lot of complicated mathematical operations, or just standard operations (no files, no web), would asyncio help me anyway? Is there any reason to use it in apps where we are not making requests and doing all this web or file stuff?
I’m wondering because in college my teachers never mentioned that we can’t get any benefit from using it for just math or standard (local, non-file, non-web) operations, and I thought that we benefit from it (almost always). Am I totally wrong? Is it that way just in Python, or in every other language?
asyncio is, in the first place, a convenient way to run multiple execution flows (compared to common alternatives like callbacks and threads).
Why would someone want to run multiple execution flows? Usually to gain performance, for example:
You don't want to waste time waiting for one network request to finish, so you start another concurrently, gaining performance.
You don't want to waste time waiting for one OS thread to finish, so you start another concurrently. In Python, due to the GIL, you won't gain performance with threads for CPU-bound operations. But they can still be useful for network stuff, or specifically in asyncio as a common way to run something blocking without freezing the event loop.
You don't want to waste time waiting for one OS process to finish, so you start another concurrently.
The last item is a way to gain performance even for purely CPU-bound operations (if the machine has multiple cores). You can see an example here (third option). asyncio here, again, is just a tool for conveniently managing execution flows. Nothing stops you from using a plain ProcessPoolExecutor and de-facto callbacks, as shown here.
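As a minimal runnable sketch of that last option: asyncio dispatching CPU-bound work to a ProcessPoolExecutor, so multiple cores get used despite the GIL.

import asyncio
from concurrent.futures import ProcessPoolExecutor

def fib(n):
    # A deliberately CPU-bound toy function (kept at module level
    # so it can be pickled for the worker processes).
    return n if n < 2 else fib(n - 1) + fib(n - 2)

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # Four calls run in separate processes, concurrently.
        results = await asyncio.gather(
            *(loop.run_in_executor(pool, fib, 30) for _ in range(4)))
    print(results)

if __name__ == "__main__":
    asyncio.run(main())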

Python: Interruptable threading in wx

My wx GUI shows thumbnails, but they're slow to generate, so:
The program should remain usable while the thumbnails are generating.
Switching to a new folder should stop generating thumbnails for the old folder.
If possible, thumbnail generation should make use of multiple processors.
What is the best way to do this?
Putting the thumbnail generation in a background thread with threading.Thread will solve your first problem, making the program usable.
If you want a way to interrupt it, the usual way is to add a "stop" variable which the background thread checks every so often (e.g., once per thumbnail), and the GUI thread sets when it wants to stop it. Ideally you should protect this with a threading.Condition. (The condition isn't actually necessary in most cases—the same GIL that prevents your code from parallelizing well also protects you from certain kinds of race conditions. But you shouldn't rely on that.)
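As a minimal sketch of that stop-flag pattern, using threading.Event instead of a bare variable plus Condition (Event wraps the locking for you); make_thumbnail stands in for the real generation code.

import threading

stop_requested = threading.Event()

def generate_thumbnails(paths, make_thumbnail):
    for path in paths:
        if stop_requested.is_set():   # checked once per thumbnail
            return
        make_thumbnail(path)

# From the GUI thread, when the user switches folders:
#     stop_requested.set()      # ask the worker to stop
#     worker.join()             # optionally wait for it to finish
#     stop_requested.clear()    # reset before starting the next worker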
For the third problem, the first question is: Is thumbnail generation actually CPU-bound? If you're spending more time reading and writing images from disk, it probably isn't, so there's no point trying to parallelize it. But, let's assume that it is.
First, if you have N cores, you want a pool of N threads, or N-1 if the main thread has a lot of work to do too, or maybe something like 2N or 2N-1 to trade off a bit of best-case performance for a bit of worst-case performance.
However, if that CPU work is done in Python, or in a C extension that nevertheless holds the Python GIL, this won't help, because most of the time, only one of those threads will actually be running.
One solution to this is to switch from threads to processes, ideally using the standard multiprocessing module. It has built-in APIs to create a pool of processes, and to submit jobs to the pool with simple load-balancing.
The problem with using processes is that you no longer get automatic sharing of data, so that "stop flag" won't work. You need to explicitly create a flag in shared memory, or use a pipe or some other mechanism for communication instead. The multiprocessing docs explain the various ways to do this.
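As one rough sketch of a shared stop flag: a Manager-backed Event is picklable, so it can be passed to pool workers along with each job. make_thumbnail and the file list are placeholders.

import multiprocessing

def make_thumbnail(path):
    return path + ".thumb"        # placeholder for the real work

def make_unless_stopped(args):
    stop_flag, path = args
    if stop_flag.is_set():        # the same flag is visible in every worker
        return None
    return make_thumbnail(path)

if __name__ == "__main__":
    manager = multiprocessing.Manager()
    stop_flag = manager.Event()   # set() this from the GUI process to cancel
    paths = ["a.png", "b.png", "c.png"]
    with multiprocessing.Pool() as pool:
        print(pool.map(make_unless_stopped, [(stop_flag, p) for p in paths]))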
You can actually just kill the subprocesses. However, you may not want to do this. First, unless you've written your code carefully, it may leave your thumbnail cache in an inconsistent state that will confuse the rest of your code. Also, if you want this to be efficient on Windows, creating the subprocesses takes some time (not as in "30 minutes" or anything, but enough to affect the perceived responsiveness of your code if you recreate the pool every time a user clicks a new folder), so you probably want to create the pool before you need it, and keep it for the entire life of the program.
Other than that, all you have to get right is the job size. Hopefully creating one thumbnail isn't too big a job—but if it's too small a job, you can batch multiple thumbnails up into a single job—or, more simply, look at the multiprocessing API and change the way it batches jobs when load-balancing (e.g. the chunksize argument to Pool.map).
Meanwhile, if you go with a pool solution (whether threads or processes), if your jobs are small enough, you may not really need to cancel. Just drain the job queue—each worker will finish whichever job it's working on now, but then sleep until you feed in more jobs. Remember to also drain the queue (and then maybe join the pool) when it's time to quit.
One last thing to keep in mind is that if you successfully generate thumbnails as fast as your computer is capable of generating them, you may actually cause the whole computer—and therefore your GUI—to become sluggish and unresponsive. This usually comes up when your code is actually I/O bound and you're using most of the disk bandwidth, or when you use lots of memory and trigger swap thrash, but if your code really is CPU-bound, and you're having problems because you're using all the CPU, you may want to either use 1 fewer core, or look into setting thread/process priorities.

Python: Continuously and cancelably repeat execution with fixed interval

What is the best way to continuously repeat the execution of a given function at a fixed interval while being able to terminate the executor (thread or process) immediately?
Basically I know two approaches:
use multiprocessing and a function with an infinite loop and time.sleep at the end. Processing is terminated with process.terminate() in any state.
use threading and constantly recreate timers at the end of the thread function. Processing is terminated by timer.cancel() while sleeping.
(Both “in any state” and “while sleeping” are fine, even though the latter may not be immediate.) The problem is that I have to use both multiprocessing and threading, as the latter appears not to work on ARM (some fuzzy interaction of the Python interpreter and vim; outside of vim everything is fine). (I was using the second approach there; I have not tried threading with a loop, and no code is currently left.) Meanwhile the former spawns way too many processes, which I would rather not see unless really required. This leads to the problem of having to code two different approaches, while threading with a loop would be just a few more imports providing drop-in replacements for all the multiprocessing stuff, wrapped in if/else (except that there is no thread.terminate()). Is there some better way to do the job?
Currently used code is here (currently with cycle for both jobs), but I do not think it will be much useful to answer the question.
Update: The reason why I am using this solution is functions that display file status (and some other things, like the branch) of version control systems in the vim statusline. These statuses must be updated, but updating them immediately cannot be done without using hooks, and I have no idea how to set hooks temporarily and remove them on vim quit without possibly spoiling the user's configuration. Thus the standard solution is a cache expiring after N seconds. But when the cache has expired I need to do an expensive shell call, and the delay is noticeable; the heavier the I/O load, the more noticeable it is. What I am implementing now is updating the values for viewed buffers every N seconds in a separate process, so the delays bother that process and not me. Threads are likely to also work, because the GIL does not affect calls to external programs.
I'm not clear on why a single long-lived thread that loops infinitely over the tasks wouldn't work for you? Or why you end up with many processes in the multiprocess option?
My immediate reaction would have been a single thread with a queue to feed it things to do. But I may be misunderstanding the problem.
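For what it's worth, a minimal sketch of that single long-lived thread: Event.wait doubles as an interruptible sleep, so cancel() takes effect immediately, even mid-interval.

import threading

class Repeater:
    def __init__(self, interval, func):
        self._interval = interval
        self._func = func
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # wait() returns True as soon as the flag is set, ending the loop.
        while not self._stop.wait(self._interval):
            self._func()

    def start(self):
        self._thread.start()

    def cancel(self):
        self._stop.set()

# Usage: r = Repeater(5.0, refresh_statusline); r.start(); ...; r.cancel()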
I do not know how to do it simply and/or cleanly in Python, but I was wondering if maybe you couldn't take advantage of an existing system scheduler, e.g. crontab for *nix systems.
There is an API for this in Python, and it might satisfy your needs.
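If that means the third-party python-crontab package, a sketch might look like this (the command and comment strings are hypothetical):

from crontab import CronTab

cron = CronTab(user=True)                        # current user's crontab
job = cron.new(command='update-vcs-status.sh',
               comment='statusline-refresh')
job.minute.every(1)                              # run once a minute
cron.write()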

How do I add two integers together with Twisted?

I have two integers in my program; let's call them "a" and "b". I would like to add them together and get another integer as a result. These are regular Python int objects. I'm wondering; how do I add them together with Twisted? Is there a special performAsynchronousAddition function somewhere? Do I need a Deferred? What about the reactor? Is the reactor involved?
OK, to be clear.
Twisted doesn't do anything about CPU-bound tasks, and for good reason: there's no way to make a compute-bound job go any quicker by reordering subtasks; the only thing you could possibly do is add more compute resources, and even that wouldn't work out in Python because of a subtlety of its implementation (the GIL).
Twisted offers special semantics and event loop handling for the case where the program would become "stuck" waiting for something outside of its control; most commonly, a process running on another machine and communicating with your Twisted process over a network connection. Since you would be waiting anyway, Twisted gives you a mechanism to get more things done in the meantime. That is to say, Twisted provides concurrency for I/O-bound tasks.
tl;dr: twisted is for network code. Everything else is just normal python.
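That said, when a genuinely blocking call does creep into a Twisted program, the usual escape hatch is twisted.internet.threads.deferToThread, which runs it on the reactor's thread pool and hands back a Deferred; a minimal sketch:

from twisted.internet import reactor, threads

def blocking_add(a, b):
    # stand-in for some genuinely slow, blocking call
    return a + b

d = threads.deferToThread(blocking_add, 2, 3)
d.addCallback(print)                  # fires with 5, back on the reactor thread
reactor.callLater(1, reactor.stop)
reactor.run()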
How about this:
c = a + b
That should work, and it doesn't need to be done asynchronously (it's pretty fast).
Good question, and Twisted (or Python) should have a way to at least spawn "a + b" off to several cores (on my 8-core i7).
Unfortunately the Python GIL prevents this from happening, meaning that you will have to wait, not only for the CPU-bound task, but for one core doing the job while the seven other cores sit doing nothing.
Note: maybe a better example would be "a() + b()", or even "fact(sqrt(a()**b()))", etc., but the important point is that the above operation will lock one core, and the GIL pretty much prevents Python from doing anything else during that operation, which could take several ms...

Keeping GUIs responsive during long-running tasks

Keeping the GUI responsive while the application does some CPU-heavy processing is one of the challenges of effective GUI programming.
Here's a good discussion of how to do this in wxPython. To summarize, there are 3 ways:
Use threads
Use wxYield
Chunk the work and do it in the IDLE event handler
Which method have you found to be the most effective? Techniques from other frameworks (like Qt, GTK or the Windows API) are also welcome.
Threads. They're what I always go for because you can do it in every framework you need.
And once you're used to multi-threading and parallel processing in one language/framework, you're good on all frameworks.
Definitely threads. Why? The future is multi-core. Almost any new CPU has more than one core, or if it has just one, it might support hyperthreading and thus pretend it has more than one. To effectively make use of multi-core CPUs (and Intel is planning to go up to 32 cores in the not-so-far future), you need multiple threads. If you run everything in one main thread (usually the UI thread is the main thread), users will have CPUs with 8, 16 and one day 32 cores, and your application will never use more than one of these; IOW, it will run much, much slower than it could.
Actually, if you design an application nowadays, I would move away from the classical design and think of a master/slave relationship. Your UI is the master; its only task is to interact with the user, that is, displaying data to the user and gathering user input. Whenever your app needs to "process any data" (even small amounts, and much more importantly big ones), create a "task" of any kind, forward this task to a background thread, and make the thread perform the task, providing feedback to the UI (e.g. what percentage it has completed, or just whether the task is still running or not, so the UI can show a "work-in-progress" indicator). If possible, split the task into many small, independent sub-tasks and run more than one background worker, feeding one sub-task to each of them. That way your application can really benefit from multiple cores and gets faster the more cores CPUs have.
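A minimal wxPython sketch of this master/worker pattern; process, on_progress and on_done are hypothetical callables, and wx.CallAfter is the thread-safe way to run something back on the GUI thread.

import threading
import wx

def run_task(items, process, on_progress, on_done):
    def work():
        for i, item in enumerate(items, 1):
            process(item)                              # the heavy lifting
            wx.CallAfter(on_progress, 100 * i // len(items))
        wx.CallAfter(on_done)                          # tell the UI we're finished
    threading.Thread(target=work, daemon=True).start()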
Actually, companies like Apple and Microsoft are already planning how to make their still mostly single-threaded UIs themselves multithreaded. Even with the approach above, you may one day face the situation that the UI itself is the bottleneck: the background processes can produce data much faster than the UI can present it to the user or ask the user for input. Today many UI frameworks are barely thread-safe, many not thread-safe at all, but that will change. Serial processing (doing one task after another) is a dying design; parallel processing (doing many tasks at once) is where the future is going. Just look at graphics adapters. Even the most modern NVidia card has a pitiful clock speed if you look at the MHz/GHz of the GPU alone. How come it can beat the crap out of CPUs when it comes to 3D calculations? Simple: instead of calculating one polygon point or one texture pixel after another, it calculates many of them in parallel (actually a whole bunch at the same time), and that way it reaches a throughput that still makes CPUs cry. E.g. the ATI X1900 (to name the competitor as well) has 48 shader units!
I think delayedresult is what you are looking for:
http://www.wxpython.org/docs/api/wx.lib.delayedresult-module.html
See the wxpython demo for an example.
Threads or processes, depending on the application. Sometimes it's actually best to have the GUI be its own program and just send asynchronous calls to other programs when it has work to do. You'll still end up having multiple threads in the GUI to monitor for results, but it can simplify things if the work being done is complex and not directly connected to the GUI.
Threads -
Let's use a simple 2-layer view (GUI, application logic).
The application logic work should be done in a separate Python thread. For Asynchronous events that need to propagate up to the GUI layer, use wx's event system to post custom events. Posting wx events is thread safe so you could conceivably do it from multiple contexts.
Working in the other direction (GUI input events triggering application logic), I have found it best to home-roll a custom event system. Use the Queue module to have a thread-safe way of pushing and popping event objects. Then, for every synchronous member function, pair it with an async version that pushes the sync function object and the parameters onto the event queue.
This works particularly well if only a single application-logic-level operation can be performed at a time. The benefit of this model is that synchronization is simple: each synchronous function works within its own context, sequentially from start to end, without worry of pre-emption or hand-coded yielding. You will not need locks to protect your critical sections. At the end of the function, post an event to the GUI layer indicating that the operation is complete.
You could scale this to allow multiple application-level threads to exist, but the usual concerns with synchronization will re-appear.
edit - Forgot to mention: the beauty of this is that it is possible to completely decouple the application logic from the GUI code. The modularity helps if you ever decide to use a different framework or provide a command-line version of the app. To do this, you will need an intermediate event dispatcher (application level -> GUI) that is implemented by the GUI layer.
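A rough sketch of the queue-based scheme described above: GUI code pushes (function, args) pairs onto a Queue, and a single logic thread pops and runs them sequentially, so the functions themselves need no locks.

import queue
import threading

class LogicThread:
    def __init__(self):
        self._queue = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            func, args = self._queue.get()
            func(*args)                 # runs start-to-end, never pre-empted

    def call_async(self, func, *args):
        # The "async version" paired with each synchronous function.
        self._queue.put((func, args))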
Working with Qt/C++ for Win32.
We divide the major work units into different processes. The GUI runs as a separate process and is able to command/receive data from the "worker" processes as needed. Works nicely in today's multi-core world.
This answer doesn't apply to the OP's question regarding Python, but is more of a meta-response.
The easy way is threads. However, not every platform has pre-emptive threading (e.g. BREW, some other embedded systems). If possible, simply chunk the work and do it in the IDLE event handler.
Another problem with using threads in BREW is that it doesn't clean up C++ stack objects, so it's way too easy to leak memory if you simply kill the thread.
I use threads so the GUI's main event loop never blocks.
For some types of operations, using separate processes makes a lot of sense. Back in the day, spawning a process incurred a lot of overhead. With modern hardware this overhead is hardly even a blip on the screen. This is especially true if you're spawning a long running process.
One (arguable) advantage is that it's a simpler conceptual model than threads that might lead to more maintainable code. It can also make your code easier to test, as you can write test scripts that exercise these external processes without having to involve the GUI. Some might even argue that is the primary advantage.
In the case of some code I once worked on, switching from threads to separate processes led to a net reduction of over 5000 lines of code while at the same time making the GUI more responsive, the code easier to maintain and test, all while improving the total overall performance.
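A minimal sketch of driving such an external worker from a GUI process without blocking it; the program name and its output format are hypothetical.

import subprocess
import threading

def run_external_job(args, on_done):
    def monitor():
        proc = subprocess.run(["worker-program", *args],
                              capture_output=True, text=True)
        on_done(proc.returncode, proc.stdout)   # marshal to the GUI thread as needed
    threading.Thread(target=monitor, daemon=True).start()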
