I'm writing a program in which I want to evaluate a piece of code asynchronously. I want it to be isolated from the main thread so that it can raise an error, enter an infinite loop, or just about anything else without disrupting the main program. I was hoping to use threading.Thread, but this has a major problem; I can't figure out how to stop it. I have tried Thread._stop(), but that frequently doesn't work. I end up with a thread that I can't control hogging both interpreter time and CPU power. The code in the thread doesn't open any files or do anything else that would cause problems if I hard-killed it.
Python's multiprocessing.Process.terminate() does this really well; unfortunately, initiating a process on Windows takes nearly a second, which is long enough to cause annoying delays in my GUI.
Does anyone know either a: how to kill a Python thread (I don't think I care how dirty the exit is), or b: how to speed up starting a process?
A third possibility would be a third-party library that provides an alternative method for asynchronous execution, but I've never heard of any such thing.
In my case, the best way to do this seems to be to maintain a running worker process, and send the code to it on an as-needed basis. If the process acts up, I kill it and then start a new one immediately to avoid any delay the next time.
Related
I am running an HTTP server (homemade, in C++) that embeds a Python interpreter for server-side scripting. This is a forking server, but I don't use any threading in any parent process. I don't do any weird things with the Python interpreter (other than the forks).
In one of the scripts, however, in another thread, a call to time.sleep(0.1) can take up to one minute, especially the first call.
while not self.should_stop():
# other code
print "[PYTHON]: Sleeping"
time.sleep(0.1)
print "[PYTHON]: Slept, checking should_stop"
I know that this is where it's hanging, because the logs show only the first print, and the second much, much later.
Additional information:
the CPU is not pegged (~5%)
this is Python 2.7 on Ubuntu
These are threading threads; I do use locks and events where necessary.
I don't import threading in any process that will ever do a fork
Python is initialized before the forks; this works great elsewhere (no problems in the last 6 months)
Python can run only one threading.Thread at a time, so if there are many threads, the interpreter has to constantly switch between them, so one thread can run while the others get freezed or, in other words, interrupted.
But an interrupted thread isn't told that it's freezed, it's sort of falls unconscious for a while and then is woken up and continues its work from where it has been interrupted. So, 0.5 seconds for one particular thread may in fact turn out to be longer in real life.
Fixed!
As it turns out, the main thread (the one embedding the interpreter, in C++) doesn't actually release the GIL when it's not executing Python code (as I imagined). You actually have to release the GIL manually, with Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS, as specified here.
This makes the runtime release the GIL so other threads can run during IO-intensive tasks (like, in my case, reading or writing to/from the network). No running Python code while doing that, though.
I am using requests to pull some files. I have noticed that the program seems to hang after some large number of iterations that varies from 5K to 20K. I can tell it is hanging because the folder where the results are stored has not changed in several hours. I have been trying to interrupt the process (I am using IDLE) by hitting CTRL + C to no avail. I would like to interrupt instead of killing the process because restart is easier. I have finally had to kill the process. I restart and it runs fine again until I have the same symptoms. I would like to figure out how to diagnose the problem but since I am having to kill everything I have no idea where to start.
Is there an alternate way to view what is going on or to more robustly interrupt the process?
I have been assuming that if I can interrupt without killing I can look at globals and or do some other mucking around to figure out where my code is hanging.
In case it's not too late: I've just faced the same problems and have some tips
First thing: In python most waiting apis are not interruptible (ie Thread.join(), Lock.acquire()...).
Have a look at theese pages for more informations:
http://snakesthatbite.blogspot.fr/2010/09/cpython-threading-interrupting.html
http://docs.python.org/2/library/thread.html
Then if a thread is waiting on such a call, it cannot be stopped.
There is another thing to know: if a normal thread is running (or hanged) the main program will stay indefinitely untill all threads are stopped or the process is killed.
To avoid that, you can make the thread a daemon thread: Thread.daemon=True before calling Thread.start().
Second thing, to find where your program is hanged, you can launch it with a debugger but I prefer logging because logs are always there in case its to late to debug.
Try logging before and after each waiting call to see how much time your threads have been hanged. To have high quality logs, uses python logging configured with file handler, html handler or even better with a syslog handler.
I have a Qt application written in PySide (Qt Python binding). This application has a GUI thread and many different QThreads that are in charge of performing some heavy lifting - some rather long tasks. As such long task sometimes gets stuck (usually because it is waiting for a server response), the application sometimes freezes.
I was therefore wondering if it is safe to call QCoreApplication.processEvents() "manually" every second or so, so that the GUI event queue is cleared (processed)? Is that a good idea at all?
It's safe to call QCoreApplication.processEvents() whenever you like. The docs explicitly state your use case:
You can call this function occasionally when your program is busy
performing a long operation (e.g. copying a file).
There is no good reason though why threads would block the event loop in the main thread, though. (Unless your system really can't keep up.) So that's worth looking into anyway.
A couple of hints people might find useful:
A. You need to beware of the following:
Every so often the threads want to send stuff back to the main thread. So they post an event and call processEvents
If the code runs from the event also calls processEvents then instead of returning to the next statement, python can instead dispatch a worker thread again and that can then repeat this process.
The net result of this can be hundreds or thousands of nested processEvent statements which can then result in a recursion level exceeded error message.
Moral - if you are running a multi-threaded application do NOT call processEvents in any code initiated by a thread which runs in the main thread.
B. You need to be aware that CPython has a Global Interpreter Lock (GIL) that limits threads so that only one can run at any one time and the way that Python decides which threads to run is counter-intuitive. Running process events from a worker thread does not seem to do what it says on the can, and CPU time is not allocated to the main thread or to Python internal threads. I am still experimenting, but it seems that putting worker threads to sleep for a few miliseconds allows other threads to get a look in.
What is the best way to continuously repeat the execution of a given function at a fixed interval while being able to terminate the executor (thread or process) immediately?
Basically I know two approaches:
use multiprocessing and function with infinite cycle and time.sleep at the end. Processing is terminated with process.terminate() in any state.
use threading and constantly recreate timers at the end of the thread function. Processing is terminated by timer.cancel() while sleeping.
(both “in any state” and “while sleeping” are fine, even though the latter may be not immediate). The problem is that I have to use both multiprocessing and threading as the latter appears not to work on ARM (some fuzzy interaction of python interpreter and vim, outside of vim everything is fine) (I was using the second approach there, have not tried threading+cycle; no code is currently left) and the former spawns way too many processes which I would like not to see unless really required. This leads to a problem of having to code two different approaches while threading with cycle is just a few more imports for drop-in replacements of all multiprocessing stuff wrapped in if/else (except that there is no thread.terminate()). Is there some better way to do the job?
Currently used code is here (currently with cycle for both jobs), but I do not think it will be much useful to answer the question.
Update: The reason why I am using this solution are functions that display file status (and some other things like branch) in version control systems in vim statusline. These statuses must be updated, but updating them immediately cannot be done without using hooks and I have no idea how to set hooks temporary and remove on vim quit without possibly spoiling user configuration. Thus standard solution is cache expiring after N seconds. But when cache expired I need to do an expensive shell call and the delay appears to be noticeable, the more noticeable the heavier IO load is. What I am implementing now is updating values for viewed buffers each N seconds in a separate process thus delays are bothering that process and not me. Threads are likely to also work because GIL does not affect calls to external programs.
I'm not clear on why a single long-lived thread that loops infinitely over the tasks wouldn't work for you? Or why you end up with many processes in the multiprocess option?
My immediate reaction would have been a single thread with a queue to feed it things to do. But I may be misunderstanding the problem.
I do not know how do it simply and/or cleanly in Python, but I was wondering if maybe you couldn't take avantage of an existing system scheduler, e.g. crontab for *nix system.
There is an API in python and it might satisfied your needs.
I would like to kill a function that executes to long. What is important this function is inside C extension (wrapped in Cython), and I would like this solution to work in multithreaded enviorment. Since it is wrapped in Cython this thread could hold GIL.
I have no control whatsoever on what is happening inside this extension (and I think that this code will not respond to interrupts).
I'm fairly certain that this code will be only run on Unix machines. But question Python kill hanging function does not apply because I think that signals would not work in multithreaded enviorment (AFAIK it is undefined which thread will catch them) --- but I might be wrong on this one :) so correct me.
Is there any way for me to resolve this without spawning new processes.
My solution is to wrap this function in another python process and if needed kill that process.
A piece of advice for anyone who googles this question: since process startup time (starting interpreter, loading modules and then loading data into memory) can last couple of seconds you need to group your function calls so this overhead won;t kill you (so there is no reusable solution really).
Example solution, was posted to na another question: How to interrupt native extension code without killing the interpreter?.