Python:Django: Signal handler and main thread


I am building a Django application that depends on a Python module which installs a SIGINT signal handler.
Assuming I cannot change the module I depend on, how can I work around the "signal only works in main thread" error I get when integrating it into Django?
Can I run it on the Django main thread?
Is there a way to disable the handler so that the module can run on non-main threads?
Thanks!

Django's built-in development server has its auto-reload feature enabled by default, and that feature spawns a new thread as a means of reloading code, so your code no longer runs in the main thread. To work around this you can simply do the following, although you'd obviously lose the convenience of auto-reloading:
python manage.py runserver --noreload
You'll also need to be mindful of this when choosing your production setup. At least some of the deployment options (such as threaded FastCGI) are certain to execute your code outside the main thread.
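To make the error concrete, here is a minimal sketch that reproduces it outside Django. The function names are made up for illustration; the only real APIs used are signal.signal and threading.Thread. It assumes Python 3.

```python
import signal
import threading


def try_register_sigint():
    """Try to install a SIGINT handler from whichever thread calls this."""
    try:
        # default_int_handler is the stock handler that raises KeyboardInterrupt.
        signal.signal(signal.SIGINT, signal.default_int_handler)
        return "registered"
    except ValueError as exc:
        # CPython refuses to register handlers outside the main thread.
        return "failed: {}".format(exc)


def register_from_worker_thread():
    """Run the registration attempt in a non-main thread and return its result."""
    results = []
    worker = threading.Thread(target=lambda: results.append(try_register_sigint()))
    worker.start()
    worker.join()
    return results[0]
```

Calling register_from_worker_thread() returns the "failed: ..." string carrying the same message you see under runserver, whereas calling try_register_sigint() directly from the main thread succeeds.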

I use Python 3.5 and Django 1.8.5 in my project, and I ran into a similar problem recently. My xxx.py code, which uses signal, runs fine on its own, but it fails when run inside Django as a package with the error "signal only works in main thread".
First, runserver with --noreload --nothreading works, but it makes my multi-threaded code too slow for me.
Second, I found that the code in my package's __init__.py runs in the main thread. However, only the main thread can catch the signal, so the rest of the code in my package cannot catch it at all. That did not solve my problem, although it may be a solution for you.
Finally, I found the built-in subprocess module. It lets you start a complete, separate process that has its own main thread, so code that registers signal handlers runs there without trouble. I have not measured the performance impact, but it works well for me. PS: you can find all the details about subprocess in the Python documentation.
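A rough sketch of the subprocess idea, assuming Python 3.7+ (for capture_output). CHILD_CODE is a hypothetical stand-in for the third-party module that installs its own SIGINT handler; in a real project you would launch your own script instead.

```python
import subprocess
import sys
import textwrap

# Hypothetical stand-in for the module that registers a SIGINT handler on import.
CHILD_CODE = textwrap.dedent("""
    import signal
    signal.signal(signal.SIGINT, lambda signum, frame: None)
    print("handler installed in child main thread")
""")


def run_in_subprocess():
    """Run the signal-registering code in a fresh interpreter with its own main thread."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return completed.returncode, completed.stdout.strip()
```

Because the child is a separate process, run_in_subprocess() can be called from any thread of the Django process, including a request-handling thread.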
Thank you~

There is a cleaner way that doesn't break your ability to use threads and processes.
Put your registration calls in manage.py:
    import os
    import signal
    import sys
    import threading

    def handleKill(signum, frame):
        print("Killing Thread.")
        # Or whatever code you want here
        # ForceTerminate is my own flag holder, defined elsewhere in the project
        ForceTerminate.FORCE_TERMINATE = True
        print(threading.active_count())
        sys.exit(0)

    if __name__ == "__main__":
        os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")
        from django.core.management import execute_from_command_line
        # manage.py starts in the main thread, so registering handlers here is allowed
        signal.signal(signal.SIGINT, handleKill)
        signal.signal(signal.SIGTERM, handleKill)
        execute_from_command_line(sys.argv)

Although the question does not describe exactly the situation you are in, here is some more generic advice:
Python always runs signal handlers in the main thread. For this reason, the signal handler has to be registered from the main thread.
From that point on, the action that the signal triggers needs to be communicated to the other threads. I usually do this using Events. The signal handler sets the event, the other threads check it, and they then know that action X has been triggered. Obviously this implies that the event must be shared among the threads.
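A minimal sketch of that pattern, assuming "Events" means threading.Event. The names stop_event, handle_signal, worker and install_handlers are invented for this example.

```python
import signal
import threading

# Shared between the main thread (which sets it) and the workers (which poll it).
stop_event = threading.Event()


def handle_signal(signum, frame):
    """Executed in the main thread; it only records that a stop was requested."""
    stop_event.set()


def worker(results):
    """Do small units of work until the main thread signals a stop."""
    while not stop_event.wait(timeout=0.1):
        pass  # one unit of real work would go here
    results.append("stopped cleanly")


def install_handlers():
    """Must be called from the main thread, e.g. at the top of manage.py."""
    signal.signal(signal.SIGINT, handle_signal)
    signal.signal(signal.SIGTERM, handle_signal)
```

The handler itself stays tiny on purpose: it sets the flag and returns, and each worker decides how to wind down when it next checks the event.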

Related

Can Python multiprocessing process be launched from OTHER than main?

I am writing a framework in Python 3.8 on Debian that will launch some multiprocessing processes. I want the configuration and launching of the processes to be done in functions OTHER than the main. The main file will be written by the end user of the framework and they should not need to know about these processes. Hence I tried to put the code that configures and launches the processes in helper functions or class methods that the main will call.
What I'm finding is as soon as the launcher function / method exits the processes die. This is even though the launcher functions / methods run (I think) in the same process as main which is still running. I have put a long time.sleep in the launcher functions / methods right before they exit and it seems the processes are alive for that long.
I tried setting the 'daemon' flag but that doesn't seem to solve it. If this is truly a limitation of multiprocessing I can instruct the users of my framework to always put some boiler-plate launcher code in their file, but it seems clunky. All help is appreciated!

The function that launches the processes was returning only the processes to main and NOT managed queues that the processes were using. I changed to return a dict with the processes AND the managed queues and everything works. This is even though main does NOT use the queues.

Django: How to run a function when server exits?

I am writing a Django project where several processes are opened using Popen. Right now, when the server exits, these processes are orphaned. I have a function to terminate these processes, and I wish to organise it so that this function is called automatically when the server quits.
Any help would be greatly appreciated.

Since you haven't specified which HTTP server you are using (uWSGI, nginx, apache etc.), you can test this recipe out on a simple dev server.
What you can try is to register a cleanup function via atexit module that will be called at process termination. You can do this easily by overriding django's builtin runserver command.
Create a file named runserver.py and put that in $PATH_TO_YOUR_APP/management/commands/ directory.
This assumes PROCESSES_TO_KILL is a global list holding references to the processes that should be killed upon server termination.
    import atexit
    import signal
    import sys

    from django.core.management.commands.runserver import Command as BaseRunserverCommand

    # PROCESSES_TO_KILL is assumed to be defined or imported elsewhere (list of Popen objects)


    class Command(BaseRunserverCommand):  # Django only discovers a class named Command
        def __init__(self, *args, **kwargs):
            atexit.register(self._exit)
            signal.signal(signal.SIGINT, self._handle_SIGINT)
            super(Command, self).__init__(*args, **kwargs)

        def _exit(self):
            for process in PROCESSES_TO_KILL:
                process.terminate()

        def _handle_SIGINT(self, signum, frame):
            self._exit()
            sys.exit(0)
Just be aware that this works great for normal termination of the script, but it won't get called in all cases (e.g. fatal internal errors).
Hope this helps.

First of all "When the server quits" is ambiguous. Does this stuff run when responding to a request? Does this stuff run during a management command?
Let's assume for the sake of argument, that you are running this somewhere in a view, so you want to have something that runs after each view returns in order to clean up junk that the view left hanging around.
Most likely, what you are looking to do is to write some Middleware. Even more specifically, some sort of process_response.
However, based on the short description of what you have so far, it sounds far more likely that you should be using some task manager, such as Celery to manage asynchronous tasks and processes.

What can cause python registered signals to be ignored?

I have a python script with multiple threading-launched threads in which several threads occasionally freeze (apparently simultaneously). In this script, I've registered a signal handler to dump stack traces from all the running threads. When it's frozen, no dumped stacks appear. What could be causing this?
A couple of possibilities that come to mind:
A thread is not releasing a mutex, freezing any other threads that attempt to acquire it. I would expect the signal handler to work in this case, however. Am I mistaken?
I log various things to stdout and stderr, which are redirected with a bash command line to a log file. Perhaps precisely timed output from 2 threads could be blocking at OS level? This script has been running for months without problems, though there was a kernel update just recently (it's Ubuntu 12.04). If this is the case, the signal is not being ignored, just not producing any output...
I have a few global variables that are read and written by the freezing threads. I had thought that python 2.7 has a global thread lock to make this safe, and it's not been a problem before.

Python's signal module runs signal handlers on the main interpreter thread exclusively. If your main thread is hung and unable to execute Python code, your signal handlers will not run. Signals will be fired and caught, but if the thread can't execute Python code then nothing will happen.
The best way to avoid this situation is to ensure your main thread (the first thread that exists in the Python interpreter upon startup) does not deadlock. This may mean ensuring that nothing important happens on that thread after initialization.
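As a sketch of that layout: the main thread registers the handler, hands the real work to other threads, and then does nothing but sleep, so it is always free to run the handler. This assumes Python 3 on a POSIX system (SIGUSR1); format_stacks, dump_stacks and main are made-up names, and start_workers is a hypothetical callable that launches your own threads.

```python
import signal
import sys
import threading
import time
import traceback


def format_stacks():
    """Return the current stack of every running thread as one string."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread {} ({}):".format(names.get(thread_id, "?"), thread_id))
        lines.extend(traceback.format_stack(frame))
    return "\n".join(lines)


def dump_stacks(signum, frame):
    """Signal handler; Python runs it in the main thread."""
    sys.stderr.write(format_stacks() + "\n")


def main(start_workers):
    """Register the handler, start the workers, then keep the main thread idle."""
    signal.signal(signal.SIGUSR1, dump_stacks)
    start_workers()
    while True:
        time.sleep(1)  # the sleep is interrupted whenever a signal arrives
```

With this in place, kill -USR1 <pid> should print the stacks even when every worker is stuck on a lock, because the main thread never takes any of those locks.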

Python signals hosted on WSGI

I'm using Python's signal module to kill a function if it runs longer than a set period of time.
It works well in my tests but when hosted on the server I get the following error
"signal only works in main thread"
I have set the WSGI signals restriction to be off in my httpd.conf
WSGIRestrictSignal Off
as described
http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Registration_Of_Signal_Handlers
I'm using the functions from the recipe described here
http://code.activestate.com/recipes/307871/
Not sure what I'm doing wrong. Is there a way to ensure that the signals are called in the main thread?

The only time any code under Apache/mod_wsgi runs in the main thread is when a WSGI script file is being imported via WSGIImportScript or an equivalent method. Although you could use that to register the signal handler from the main thread, it would be of no use, because all subsequent requests are serviced by secondary threads and never by the main thread. As a consequence, I don't think your signal handler will ever run: recall that it can only run if the main thread is actually executing Python code, and that will not be the case, because the main thread will be blocked in Apache/mod_wsgi, simply waiting for the process to be shut down.
What is the operation you are trying to kill actually doing? Doing it within the context of a web application, especially a multi-threaded one, probably isn't a good idea.

Twisted network client with multiprocessing workers?

So, I've got an application that uses Twisted + Stomper as a STOMP client which farms out work to a multiprocessing.Pool of workers.
This appears to work ok when I just use a python script to fire this up, which (simplified) looks something like this:
# stompclient.py
logging.config.fileConfig(config_path)
logger = logging.getLogger(__name__)
# Add observer to make Twisted log via python
twisted.python.log.PythonLoggingObserver().start()
# initialize the process pool. (child processes get forked off immediately)
pool = multiprocessing.Pool(processes=processes)
StompClientFactory.username = username
StompClientFactory.password = password
StompClientFactory.destination = destination
reactor.connectTCP(host, port, StompClientFactory())
reactor.run()
As this gets packaged for deployment, I thought I would take advantage of the twistd script and run this from a tac file.
Here's my very-similar-looking tac file:
# stompclient.tac
logging.config.fileConfig(config_path)
logger = logging.getLogger(__name__)
# Add observer to make Twisted log via python
twisted.python.log.PythonLoggingObserver().start()
# initialize the process pool. (child processes get forked off immediately)
pool = multiprocessing.Pool(processes=processes)
StompClientFactory.username = username
StompClientFactory.password = password
StompClientFactory.destination = destination
application = service.Application('myapp')
service = internet.TCPClient(host, port, StompClientFactory())
service.setServiceParent(application)
For the sake of illustration, I have collapsed or changed a few details; hopefully they were not the essence of the problem. For example, my app has a plugin system, the pool is initialized by a separate method, and then work is delegated to the pool using pool.apply_async() passing one of my plugin's process() methods.
So, if I run the script (stompclient.py), everything works as expected.
It also appears to work OK if I run twistd in non-daemon mode (-n):
twistd -noy stompclient.tac
however, it does not work when I run in daemon mode:
twistd -oy stompclient.tac
The application appears to start up OK, but when it attempts to fork off work, it just hangs. By "hangs", I mean that it appears that the child process is never asked to do anything and the parent (that called pool.apply_async()) just sits there waiting for the response to return.
I'm sure that I'm doing something stupid with Twisted + multiprocessing, but I'm really hoping that someone can explain to me the flaw in my approach.
Thanks in advance!

Since the difference between your working invocation and your non-working invocation is only the "-n" option, it seems most likely that the problem is caused by the daemonization process (which "-n" prevents from happening).
On POSIX, one of the steps involved in daemonization is forking and having the parent exit. Among other things, this has the consequence of having your code run in a different process than the one in which the .tac file was evaluated. This also re-arranges the child/parent relationship of processes which were started in the .tac file, as your pool of multiprocessing processes was.
The multiprocessing pool's processes start off with a parent of the twistd process you start. However, when that process exits as part of daemonization, their parent becomes the system init process. This may cause some problems, although probably not the hanging problem you described. There are probably other similarly low-level implementation details which normally allow the multiprocessing module to work but which are disrupted by the daemonization process.
Fortunately, avoiding this strange interaction should be straightforward. Twisted's service APIs allow you to run code after daemonization has completed. If you use these APIs, then you can delay the initialization of the multiprocessing module's process pool until after daemonization and hopefully avoid the problem. Here's an example of what that might look like:
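    import multiprocessing

    from twisted.application.service import Service

    class MultiprocessingService(Service):
        def startService(self):
            # Called after daemonization, so the pool is created in the daemon process
            self.pool = multiprocessing.Pool(processes=processes)

    # application and processes come from the rest of the .tac file
    MultiprocessingService().setServiceParent(application)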
Now, separately, you may also run into problems relating to clean up of the multiprocessing module's child processes, or possibly issues with processes created with Twisted's process creation API, reactor.spawnProcess. This is because part of dealing with child processes correctly generally involves handling the SIGCHLD signal. Twisted and multiprocessing aren't going to be cooperating in this regard, though, so one of them is going to get notified of all children exiting and the other will never be notified. If you don't use Twisted's API for creating child processes at all, then this may be okay for you - but you might want to check to make sure any signal handler the multiprocessing module tries to install actually "wins" and doesn't get replaced by Twisted's own handler.

A possible idea for you...
When running in daemon mode twistd will close stdin, stdout and stderr. Does something that your clients do read or write to these?
