Is there any way to terminate a running function from a thread? - python

I've recently tried to write my own socket server in Python.
While I was writing a thread to handle server commands (a sort of command line in the server), I tried to implement code that restarts the server when raw_input() receives a specific command.
Basically, I want to restart the server as soon as the "Running" variable changes its state from True to False. When it does, I would like to stop the function that started the thread (getting back to the main function) and then run it again. Is there a way to do it?
Thank you very much, and I hope I was clear about my problem,
Idan :)

Communication between threads can be done with Events, Queues, Semaphores, etc. Check them out and choose the one that fits your problem best.
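As a minimal sketch of the Event option (Python 3 syntax; serve_once and run_with_restarts are hypothetical names, and a threading.Timer stands in for the command thread), the server loop can check an event that the other thread sets, then be started again:

```python
import threading

def serve_once(stop_requested):
    """Hypothetical server loop: runs until another thread sets the event."""
    iterations = 0
    while not stop_requested.is_set():
        # Real code would accept and handle connections here, using timeouts
        # so that control returns to the check above regularly.
        iterations += 1
        stop_requested.wait(0.01)  # sleep that ends early if the event is set
    return iterations

def run_with_restarts(restarts):
    """Run the loop and start it again each time a restart is requested."""
    completed = 0
    for _ in range(restarts):
        stop_requested = threading.Event()
        # Stand-in for the command thread: asks for a restart after 50 ms.
        commander = threading.Timer(0.05, stop_requested.set)
        commander.start()
        serve_once(stop_requested)
        commander.join()
        completed += 1
    return completed
```

This only works if the loop never blocks indefinitely, which is why the sketch waits on the event with a timeout instead of sleeping.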

You can't abort a thread, or raise an exception into it asynchronously, in Python.
The standard Unix solution to this problem is to use a non-blocking socket, create a pipe with os.pipe, replace all your blocking sock.recv calls with a blocking r, _, _ = select.select([sock, pipe], [], []) (where pipe is the read end of the pipe), and then the other thread can write to the pipe to wake up the waiting thread.
To make this portable to Windows you'll need to create a UDP localhost socket instead of a pipe, which makes things slightly more complicated, but it's still not hard.
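A rough sketch of the pipe-based wake-up described above, in Python 3 and for Unix only (on Windows, select accepts only sockets). The names wait_for_client and demo are made up for illustration, and the listening socket on localhost is just a stand-in for a real server socket:

```python
import os
import select
import socket
import threading

def wait_for_client(listener, wake_r):
    """Block until a client connects or the wake-up pipe becomes readable."""
    readable, _, _ = select.select([listener, wake_r], [], [])
    if wake_r in readable:
        os.read(wake_r, 1)           # drain the wake-up byte
        return None                  # another thread asked us to stop
    conn, _addr = listener.accept()
    return conn

def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    listener.setblocking(False)
    wake_r, wake_w = os.pipe()
    results = []
    worker = threading.Thread(
        target=lambda: results.append(wait_for_client(listener, wake_r)))
    worker.start()
    os.write(wake_w, b"x")           # wake the blocked thread from outside
    worker.join(timeout=5)
    for fd in (wake_r, wake_w):
        os.close(fd)
    listener.close()
    return results
```

In this sketch no client ever connects, so the only thing that can end the select call is the byte written to the pipe.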
Or, of course, you can use a higher-level framework, like asyncio in 3.4+, or twisted or another third-party lib, which will wrap this up for you. (Most of them are already running the equivalent of a loop around select to service lots of clients in one thread or a small thread pool, so it's trivial to toss in a stop pipe.)
Are there other alternatives? Yes, but all less portable and less good in a variety of other ways.
Most platforms have a way to asynchronously kill or signal another thread, which you can access via, e.g., ctypes. But this is a bad idea, because it will prevent Python from doing any normal cleanup. Even if you don't get a segfault, this could mean files never get flushed and end up with incomplete/garbage data, locks are left acquired to deadlock your program somewhere completely unrelated a short time later, memory gets leaked, etc.
If you're specifically trying to interrupt the main thread, and you only care about CPython on Unix, you can use a signal handler and the kill function. The signal will take effect on the next Python bytecode, and if the interpreter is blocked on any kind of I/O (or most other syscalls, e.g., inside a sleep), the system will return to the interpreter with an EINTR, allowing it to interrupt immediately. If the interpreter is blocked on something else, like a call to a C library that blocks signals or just does nothing but CPU work for 30 seconds, then you'll have to wait 30 seconds (although that doesn't come up that often, and you should know if it will in your case). Also, threads and signals don't play nice on some older *nix platforms. And signals don't work the same way on Windows, or in some other Python implementations like Jython.
On some platforms (including Windows--but not most modern *nix platforms), you can wake up a blocking socket call just by closing the socket out from under the waiting thread. On other platforms, this will not unblock the thread, or will do it sometimes but not other times (and theoretically it could even segfault your program or leave the socket library in an unusable state, although I don't think either of those will happen on any modern platform).

As far as I understand the documentation, and from some experiments I've done over the last few weeks, there is no way to really force another thread to 'stop' or 'abort', unless the function is aware of the possibility of being stopped and has a foolproof method of avoiding getting stuck in one of the I/O functions. In that case you can use some communication method such as semaphores. The only exception is the specialized threading.Timer class, which has a cancel() method (and that only works while the timer is still waiting to fire).
So, if you really want to stop the server thread forcefully, you might want to think about running it in a separate process, not a thread.
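A minimal sketch of the separate-process idea, assuming Python 3 and using subprocess. CHILD_CODE is a placeholder for a real server script, and start_server and restart_server are hypothetical helper names:

```python
import subprocess
import sys

# Stand-in for a real server script: a child that just blocks for a minute.
CHILD_CODE = "import time; time.sleep(60)"

def start_server():
    """Launch the (placeholder) server in its own process."""
    return subprocess.Popen([sys.executable, "-c", CHILD_CODE])

def restart_server(proc, timeout=5):
    """Forcefully stop the running child, then launch a fresh one."""
    proc.terminate()            # SIGTERM on Unix, TerminateProcess on Windows
    proc.wait(timeout=timeout)  # reap the old process before starting anew
    return start_server()
```

Unlike a thread, the child can be stopped even while it is stuck in a blocking call, because the operating system tears the whole process down.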
EDIT: I'm not sure why you want to restart the server - I just thought it was in case of a failure. Normal procedure in a server is to loop waiting for connections on the socket, and when a connection appears, attend it and return to that loop.
A better way is to use the GIO library (part of GLib) and connect methods to the connection event, so that connections are handled asynchronously. This avoids the loop completely. I don't have any real code for this in Python, but here's an example of a client in Python (which uses GIO for reception events) and a server in C, which uses GIO for connections.
Use of GIO makes life so much easier...

Related

Python time.sleep taking much longer

I am running an HTTP server (homemade, in C++) that embeds a Python interpreter for server-side scripting. This is a forking server, but I don't use any threading in any parent process. I don't do any weird things with the Python interpreter (other than the forks).
In one of the scripts, however, in another thread, a call to time.sleep(0.1) can take up to one minute, especially the first call.
while not self.should_stop():
    # other code
    print "[PYTHON]: Sleeping"
    time.sleep(0.1)
    print "[PYTHON]: Slept, checking should_stop"
I know that this is where it's hanging, because the logs show only the first print, and the second much, much later.
Additional information:
the CPU is not pegged (~5%)
this is Python 2.7 on Ubuntu
These are threading threads; I do use locks and events where necessary.
I don't import threading in any process that will ever do a fork
Python is initialized before the forks; this works great elsewhere (no problems in the last 6 months)
Python can run only one threading.Thread at a time, so if there are many threads, the interpreter has to constantly switch between them: while one thread runs, the others are frozen or, in other words, interrupted.
But an interrupted thread isn't told that it's frozen; it sort of falls unconscious for a while, then is woken up and continues its work from where it was interrupted. So 0.5 seconds for one particular thread may in fact turn out to be longer in real life.
Fixed!
As it turns out, the main thread (the one embedding the interpreter, in C++) doesn't actually release the GIL when it's not executing Python code (as I imagined). You actually have to release the GIL manually, with Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS, as specified here.
This makes the runtime release the GIL so other threads can run during IO-intensive tasks (like, in my case, reading or writing to/from the network). No running Python code while doing that, though.

Listening for events on a network and handling callbacks robustly

I am developing a small Python program for the Raspberry Pi that listens for some events on a Zigbee network.
The way I've written this is rather simplistic: I have a while(True): loop checking for a Unique ID (UID) from the Zigbee. If a UID is received, it's looked up in a dictionary containing some callback methods. So, for instance, in the dictionary the key 101 is tied to a method called PrintHello().
So if that key/UID is received method PrintHello will be executed - pretty simple, like so:
if UID in self.expectedCallBacks:
    self.expectedCallBacks[UID]()
I know this approach is probably too simplistic. My main concern is, what if the system is busy handling a method and the system receives another message?
On an embedded MCU I can handle this easily with a circular buffer + interrupts, but I'm a bit lost when it comes to doing this with a RPi. Do I need to implement a new thread for the Zigbee module that basically fills a buffer that the callback handler can then retrieve/read from?
I would appreciate any suggestions on how to implement this more robustly.
Threads can definitely help to some degree here. Here's a simple example using a ThreadPool:
from multiprocessing.pool import ThreadPool

pool = ThreadPool(2)  # Create a 2-thread pool
while True:
    uid = zigbee.get_uid()
    if uid in self.expectedCallBacks:
        pool.apply_async(self.expectedCallBacks[uid])
That will kick off the callback in a thread in the thread pool, and should help prevent events from getting backed up before you can send them to a callback handler. The ThreadPool will internally handle queuing up any tasks that can't be run when all the threads in the pool are already doing work.
However, remember that single-core Raspberry Pi models (such as the original Pi) have only one CPU core, so you can't execute more than one CPU-bound operation concurrently (and that's even ignoring the limitations of threading in Python caused by the GIL, which is normally solved by using multiple processes instead of threads). That means no matter how many threads/processes you have, only one can get access to the CPU at a time. For that reason, you probably don't want more than one thread actually running the callbacks, since as you add more you're just going to slow things down, due to the OS needing to constantly switch between threads.
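One way to get that single callback thread is a queue drained by one worker, which is roughly the circular buffer from the question. This is a sketch in Python 3 (the module is named Queue in Python 2); callback_worker, demo and the two lambda callbacks are invented for illustration, and the loop over fixed UIDs stands in for reading from the Zigbee module:

```python
import queue
import threading

def callback_worker(events, callbacks, handled):
    """Single consumer: runs callbacks one at a time, in arrival order."""
    while True:
        uid = events.get()            # blocks until a UID is available
        if uid is None:               # sentinel value: stop the worker
            break
        callback = callbacks.get(uid)
        if callback is not None:      # ignore UIDs nobody registered for
            handled.append(callback())

def demo():
    callbacks = {101: lambda: "hello", 102: lambda: "world"}
    events = queue.Queue()            # thread-safe buffer between the threads
    handled = []
    worker = threading.Thread(
        target=callback_worker, args=(events, callbacks, handled))
    worker.start()
    for uid in (101, 999, 102):       # stand-in for UIDs read from the radio
        events.put(uid)
    events.put(None)
    worker.join()
    return handled
```

The reading loop never waits on a callback here; it only pays the cost of a put, and messages that arrive while a callback is busy simply wait in the queue.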

Multi-Threading and Asynchronous sockets in python

I'm quite new to python threading/network programming, but have an assignment involving both of the above.
One of the requirements of the assignment is that for each new request, I spawn a new thread, but I need to both send and receive at the same time to the browser.
I'm currently using the asyncore library in Python to catch each request, but as I said, I need to spawn a thread for each request, and I was wondering whether using both threads and asyncore is overkill or the correct way to do it.
Any advice would be appreciated.
Thanks
EDIT:
I'm writing a Proxy Server, and not sure if my client is persistent. My client is my browser (using firefox for simplicity)
It seems to reconnect for each request. My problem is that if I open a tab with http://www.google.com in it, and http://www.stackoverflow.com in it, I only get one request at a time from each tab, instead of multiple requests from google, and from SO.
I answered a question that sounds amazingly similar to yours, where someone had a homework assignment to create a client-server setup, with each connection being handled in a new thread: https://stackoverflow.com/a/9522339/496445
The general idea is that you have a main server loop constantly looking for a new connection to come in. When it does, you hand it off to a thread which will then do its own monitoring for new communication.
An extra bit about asyncore vs threading
From the asyncore docs:
There are only two ways to have a program on a single processor do
“more than one thing at a time.” Multi-threaded programming is the
simplest and most popular way to do it, but there is another very
different technique, that lets you have nearly all the advantages of
multi-threading, without actually using multiple threads. It’s really
only practical if your program is largely I/O bound. If your program
is processor bound, then pre-emptive scheduled threads are probably
what you really need. Network servers are rarely processor bound,
however.
As this quote suggests, using asyncore and threading should be for the most part mutually exclusive options. My link above is an example of the threading approach, where the server loop (either in a separate thread or the main one) does a blocking call to accept a new client. And when it gets one, it spawns a thread which will then continue to handle the communication, and the server goes back into a blocking call again.
In the pattern of using asyncore, you would instead use its async loop which will in turn call your own registered callbacks for various activity that occurs. There is no threading here, but rather a polling of all the open file handles for activity. You get the sense of doing things all concurrently, but under the hood it is scheduling everything serially.
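For comparison, the threading pattern (a blocking accept loop that hands each client to its own thread) might look roughly like this in Python 3. It is an illustrative echo server, not the code from the linked answer; handle_client, serve and demo are hypothetical names, and the accept loop is bounded only so that the sketch terminates:

```python
import socket
import threading

def handle_client(conn):
    """Per-connection thread: echo data back until the client disconnects."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:              # empty read: the client closed
                break
            conn.sendall(data)

def serve(listener, max_clients):
    """Accept loop: block in accept(), hand each client to a new thread."""
    threads = []
    for _ in range(max_clients):      # a real server would loop forever
        conn, _addr = listener.accept()
        t = threading.Thread(target=handle_client, args=(conn,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()

def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(5)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve, args=(listener, 1))
    server.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"ping")
        reply = client.recv(1024)
    server.join()
    listener.close()
    return reply
```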

Receiving a Linux signal and interacting with threads

hello to you all :)
I have a program that has n threads (possibly a lot), and they do a pretty extensive job. My problem is that sometimes people turn off or reboot the server (the program runs all day on the company servers). I know there is a way to write a handler for Linux signals; what I want to know is how to interact with all the threads so that each one runs a function and then stops working. Is there a way to do that?
Sorry for the bad English :P
The best way of handling this is not requiring any shutdown actions at all.
For example, your signal handler for (e.g.) SIGTERM or SIGQUIT can just call _exit and quit the process with no clean-up.
Under Linux (with non-ancient threads) when one thread calls _exit (or exit if you really want) other threads get stopped too - whatever they were in the middle of doing.
This would be good as it implements a crash-only design.
Crash-only design for a server is based on the principle that the machine may crash at any point, so you need to be able to recover from such a failure anyway, so just make it the normal way of quitting. No extra code should be required as your server should be robust enough anyway.
About the only thing you can do is set a global variable from your signal handler, and have your threads check its value periodically.
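A sketch of that flag-based approach in Python 3, using a threading.Event in place of a plain global variable so that workers can also wait on it. The names request_shutdown, worker, start_workers and install_handler are hypothetical, and the "cleaned up" string stands in for whatever function each thread should run before stopping:

```python
import signal
import threading

shutdown_requested = threading.Event()    # plays the role of the global flag

def request_shutdown(signum, frame):
    """Signal handler: only sets the flag; the threads do the real work."""
    shutdown_requested.set()

def worker(results, index):
    while not shutdown_requested.is_set():
        # One unit of real work would go here.
        shutdown_requested.wait(0.01)     # periodic check of the flag
    results[index] = "cleaned up"         # stand-in for the shutdown function

def start_workers(n_threads):
    results = [None] * n_threads
    threads = [threading.Thread(target=worker, args=(results, i))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    return threads, results

def install_handler():
    # signal.signal() may only be called from the main thread.
    signal.signal(signal.SIGTERM, request_shutdown)
```

A program would call install_handler() in the main thread, start the workers, and join them; when SIGTERM arrives, each worker notices the flag at its next check and exits after its cleanup step.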
As others have already mentioned, signal handlers can get messy (due to the restrictions, particularly in multi-threaded programs), so it's better to choose another option:
have a dedicated thread for handling signals via sigwaitinfo - the bad news, though, is that Python 2 doesn't appear to support that out of the box (Python 3.3 and later provide signal.sigwaitinfo and signal.pthread_sigmask).
use the Linux-specific signalfd to handle signals (either in a separate thread or integrated into some event loop) - at least there is a python-signalfd module you can use.
As there is no need to install signal handlers here, there is no restriction on what you can do when you are notified of a signal, and it should be easy to shut down the other threads in your program cleanly.

Terminate long running python threads

What is the recommended way to terminate unexpectedly long-running threads in Python? I can't use SIGALRM, since
Some care must be taken if both
signals and threads are used in the
same program. The fundamental thing to
remember in using signals and threads
simultaneously is: always perform
signal() operations in the main thread
of execution. Any thread can perform
an alarm(), getsignal(), pause(),
setitimer() or getitimer(); only the
main thread can set a new signal
handler, and the main thread will be
the only one to receive signals
(this is enforced by the Python signal
module, even if the underlying thread
implementation supports sending
signals to individual threads). This
means that signals can’t be used as a
means of inter-thread
communication. Use locks instead.
Update: each thread in my case blocks -- it is downloading a web page using the urllib2 module, and sometimes the operation takes too much time on extremely slow sites. That's why I want to terminate such slow threads.
Since abruptly killing a thread that's in a blocking call is not feasible, a better approach, when possible, is to avoid using threads in favor of other multi-tasking mechanisms that don't suffer from such issues.
For the OP's specific case (the threads' job is to download web pages, and some threads block forever due to misbehaving sites), the ideal solution is twisted -- as it generally is for networking tasks. In other cases, multiprocessing might be better.
More generally, when threads give unsolvable issues, I recommend switching to other multitasking mechanisms rather than trying heroic measures in the attempt to make threads perform tasks for which, at least in CPython, they're unsuitable.
As Alex Martelli suggested, you could use the multiprocessing module. It is very similar to the threading module, so that should get you off to a start easily. Your code could look like this, for example:
import multiprocessing

def get_page(*args, **kwargs):
    # your web page downloading code goes here
    pass

def start_get_page(timeout, *args, **kwargs):
    p = multiprocessing.Process(target=get_page, args=args, kwargs=kwargs)
    p.start()
    p.join(timeout)
    if p.is_alive():
        # stop the downloading 'thread'
        p.terminate()
        # and then do any post-error processing here

if __name__ == "__main__":
    # timeout, args and kwargs are placeholders for your own values
    start_get_page(timeout, *args, **kwargs)
Of course you need to somehow get the return values of your page downloading code. For that you could use multiprocessing.Pipe or multiprocessing.Queue (or other ways available with multiprocessing). There's more information, as well as samples you could check here.
Lastly, the multiprocessing module is included in Python 2.6. It is also available for Python 2.5 and 2.4 at PyPI (you can use easy_install multiprocessing), or you can visit PyPI and download and install the packages manually.
Note: I realize this was posted a while ago. I was having a similar problem, stumbled here and saw Alex Martelli's suggestion. I implemented it for my problem and decided to share it. (I'd like to thank Alex for pointing me in the right direction.)
Use synchronization objects and ask the thread to terminate. Basically, write co-operative handling of this.
If you start yanking out the thread from beneath the Python interpreter, all sorts of odd things can occur, and it's not just Python either; most runtimes have this problem.
For instance, let's say you kill a thread after it has opened a file, there's no way that file will be closed until the application terminates.
If you are trying to kill a thread whose code you do not have control over, it depends if the thread is in a blocking call or not. In my experience if the thread is properly blocking, there is no recommended and portable way of doing this.
I've run up against this when trying to work with code in the standard library (multiprocessing.manager I'm looking at you) with loops coded with no exit condition: nice!
There are some interruptible thread implementations out there (see here for an example), but then, if you have control of the threaded code yourself, you should be able to write it in a manner where you can interrupt it with a condition variable of some sort.
