Downtime when reloading mod_wsgi daemon?

I'm running a Django application on Apache with mod_wsgi. Will there be any downtime during an upgrade?
Mod_wsgi is running in daemon mode, so I can reload my code by touching the .wsgi script file, as described in the "ReloadingSourceCode" document: http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode. Presumably, that reload requires some non-zero amount of time. What happens if a request comes in during the reload? Will Apache queue the request and then complete it once the wsgi daemon is ready?
The documentation includes the following statement:
So, if you are using Django in daemon mode and needed to change your 'settings.py' file, once you have made the required change, also touch the script file containing the WSGI application entry point. Having done that, on the next request the process will be restarted and your Django application reloaded.
To me, that suggests that Apache will gracefully handle every request, but I thought I would ask to be sure. My app isn't critical (a little downtime wouldn't be disastrous) so the question is mostly academic.
Thank you.

In daemon mode there is no concept of a graceful restart when the WSGI script file is touched to force a reload. That is, unlike Apache itself, which will start new Apache server child processes while waiting for old processes to finish up with current requests, for mod_wsgi daemon processes the existing process must exit before a new one starts up.
The consequence of this is that mod_wsgi can't wait indefinitely for current requests to complete. If it did, and all daemon processes were tied up waiting for current requests to finish, clients would see a noticeable delay in being handled.
At the other end of the scale however, the daemon process can't be immediately killed as that would cause current requests to be interrupted.
A middle ground therefore exists. The daemon process will wait for requests to finish before exiting, but if they haven't completed within the shutdown period, then the daemon process will be forcibly quit and the active requests will be interrupted.
The shutdown timeout defaults to 5 seconds. It can be overridden using the shutdown-timeout option to the WSGIDaemonProcess directive, but due consideration should be given to the effects of changing it.
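As an illustration only, the "wait, then force quit" middle ground can be simulated with plain Python threads. This is not mod_wsgi's code: `drain_or_interrupt` and `make_request` are hypothetical names, threads stand in for active requests, and a thread that is still alive at the deadline stands in for a request that would be interrupted.

```python
import threading
import time


def drain_or_interrupt(active_requests, shutdown_timeout):
    """Wait up to shutdown_timeout seconds in total for the given threads.

    Returns the names of the requests that were still running at the
    deadline, i.e. the ones a forced process exit would interrupt.
    """
    deadline = time.monotonic() + shutdown_timeout
    interrupted = []
    for thread in active_requests:
        remaining = max(0.0, deadline - time.monotonic())
        thread.join(remaining)
        if thread.is_alive():
            interrupted.append(thread.name)
    return interrupted


def make_request(name, duration):
    """Hypothetical stand-in for a request that takes `duration` seconds."""
    thread = threading.Thread(
        target=time.sleep, args=(duration,), name=name, daemon=True
    )
    thread.start()
    return thread
```

With a short and a long request in flight, only the long one would be reported as interrupted once the timeout expires.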
Thus, in respect of this specific issue, if you have long running requests still active when the first request comes in after you touched the WSGI script file, there is the risk that the active long requests will be interrupted.
The next notable thing you may see is that even if there are no long running requests and processes shut down promptly, it is still necessary to load the WSGI application again within the new process. The time this takes will be seen as a delay in handling the request. How big that delay is will depend on the framework and your application. The worst offender I know of for startup time is TurboGears. Django is somewhat better, and the quickest to start are lightweight micro frameworks such as Flask.
Do note that any new requests which come in while these shutdown and startup delays occur should not be lost. This is because the HTTP listener socket has a certain depth and connections queue up in that waiting to be accepted. If the number of requests arriving is huge though and that queue fills up, then you will start to see connection refused errors in the browser.
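The queueing behaviour of a listener socket can be seen with a small self-contained sketch. It uses only the standard `socket` module on the loopback interface; `demo_backlog` is a hypothetical name, and the backlog of 5 is an arbitrary value chosen for the demo, not what Apache uses.

```python
import socket


def demo_backlog():
    """Connect and send before the server accepts; the data is not lost."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(5)                # queue depth for not-yet-accepted connections
    server.settimeout(2)
    port = server.getsockname()[1]

    # The client connects while the server is "busy" (it has not called accept).
    client = socket.create_connection(("127.0.0.1", port), timeout=2)
    client.sendall(b"GET / HTTP/1.0\r\n\r\n")

    # Only now does the server become ready and pick up the queued connection.
    conn, _ = server.accept()
    conn.settimeout(2)
    data = conn.recv(1024)

    conn.close()
    client.close()
    server.close()
    return data
```

The connection is completed by the operating system and waits in the queue, so the request bytes are still there when `accept` is finally called.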

No, there will be no downtime. Requests using the old code will complete, and new requests will use the new code.
There will be a small bit more load on the server as the new code loads but unless your application is colossal and your servers are already nearly overloaded this will be unnoticeable.
This is like the apachectl graceful command for Apache as a whole, which tells it to start a new configuration without downtime.

Related

Debugging what uWSGI worker is doing

I have a Django application (API) running in production served by uWSGI, which has 8 processes (workers) running. To monitor them I use uwsgitop. Every day from time to time one worker falls into the BUSY state and stays for like five minutes and consumes all of the memory and kills the whole instance. The problem is, I do not know how to debug what the worker is doing at the particular moment or what function is it executing. Is there a fast and a proper way to find out the function and the request that it is handling?

One can send signal SIGUSR2 to a uwsgi worker, and the current request is printed into the log file, along with a native (sadly not Python) backtrace.
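Following that suggestion, the signal can be sent from Python as well as from the shell. This is a minimal sketch: `request_worker_dump` is a hypothetical helper name, the worker PID is assumed to be taken from uwsgitop or `ps`, and the calling user must be allowed to signal the worker.

```python
import os
import signal


def request_worker_dump(worker_pid):
    """Send SIGUSR2 to the given process ID (Unix only)."""
    os.kill(worker_pid, signal.SIGUSR2)
```

What the worker prints in response is up to uWSGI; the helper only delivers the signal.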

Flask and watchdog: multiple threads starting

I am writing an application using Flask which monitors a filesystem for updates and logs them. My startup sequence (in debug mode) is:
Create Flask application object
Start Watchdog
Start the application
When running in debug mode, the application automatically restarts with werkzeug's fsevents reloader, which is normal; however, this restart does not terminate the first watchdog thread, and so at this point there is a second watchdog thread, causing every filesystem event to be duplicated.
This doesn't occur in production, but it is impacting my debugging and makes me worry that I am doing something wrong with starting up watchdog. Is there something I should be doing in order to make watchdog exit cleanly, or some way to prevent it from starting up a second time?
Also, when the application does restart due to a code edit, the second watchdog thread does restart correctly; it is only the first watchdog that starts before the initial reload that doesn't shut down on reload.

Rather than starting the background thread before the application starts, it is cleaner and safer to start the thread using app.before_first_request. The downside to this is that the background thread won't start up until the first request comes in.
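A different technique from the one above, sketched here as an alternative: when the Werkzeug reloader is active it runs the application in a child process and sets the environment variable `WERKZEUG_RUN_MAIN` to `"true"` in that child. Checking for it lets the watcher start only once. `should_start_watcher` is a hypothetical helper name, and the `use_reloader` argument is assumed to mirror whatever is passed to `app.run`.

```python
import os


def should_start_watcher(use_reloader, environ=os.environ):
    """Return True only in the process that actually serves requests.

    Without the reloader there is a single process, so always start.
    With the reloader, the parent only supervises; the child that serves
    requests has WERKZEUG_RUN_MAIN set to "true".
    """
    if not use_reloader:
        return True
    return environ.get("WERKZEUG_RUN_MAIN") == "true"
```

The startup code would then wrap the watchdog start in `if should_start_watcher(use_reloader=True):`, so the supervising parent process never creates an observer thread.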

What happens when you have an infinite loop in Django view code?

Something that I just thought about:
Say I'm writing view code for my Django site, and I make a mistake and create an infinite loop.
Whenever someone would try to access the view, the worker assigned to the request (be it a Gevent worker or a Python thread) would stay in a loop indefinitely.
If I understand correctly, the server would send a timeout error to the client after 30 seconds. But what will happen with the Python worker? Will it keep on working indefinitely? That sounds dangerous!
Imagine I've got a server in which I've allocated 10 workers. I let it run and at some point, a client tries to access the view with the infinite loop. A worker will be assigned to it, and will be effectively dead until the next server restart. The dangerous thing is that at first I wouldn't notice it, because the site would just be imperceptibly slower, having 9 workers instead of 10. But then it might happen again and again throughout a long span of time, maybe months. The site would just get progressively slower, until eventually it would be really slow with just one worker.
A server restart would solve the problem, but I'd hate to have my site's functionality depend on server restarts.
Is this a real problem that happens? Is there a way to avoid it?
Update: I'd also really appreciate a way to take a stacktrace of the thread/worker that's stuck in an infinite loop, so I could have that emailed to me so I'll be aware of the problem. (I don't know how to do this because there is no exception being raised.)
Update to people saying things to the effect of "Avoid writing code that has infinite loops": In case it wasn't obvious, I do not spend my free time intentionally putting infinite loops into my code. When these things happen, they are mistakes, and mistakes can be minimized but never completely avoided. I want to know that even when I make a mistake, there'll be a safety net that will notify me and allow me to fix the problem.

It is a real problem. In the case of gevent it can even immediately stop your website from responding: greenlets switch cooperatively, so a loop that never yields blocks every other request in that process.
Everything depends on your environment. For example, when running Django in production through uWSGI you can set harakiri: a time in seconds after which the worker handling the request is killed if it has not finished producing the response. It is strongly recommended to set such a value in order to deal with faulty requests or bad code. Such an event is reported in the uWSGI log. I believe other solutions for running Django in production have similar options.
Otherwise, due to the network architecture, client disconnection will not stop the infinite loop, and by default there will be no response at all - just infinite loading. Various timeout options (of which harakiri is one) may end up showing a connection timeout - for example, PHP has (as far as I remember) a default timeout of 30 seconds and it will return 504 Gateway Timeout. The socket disconnection timeout depends on the HTTP server settings and it will not stop the application thread; it will only close the client socket.
If not using gevent (or any other green threads), an infinite loop will tend to take up 100% of the available CPU power (limited to one core), possibly eating up more and more memory, so your website will work pretty slowly and/or time out really quickly. Django itself is not aware of request time, so - as mentioned before - your production environment stack is the way to prevent this from happening. In the case of uWSGI, http://uwsgi-docs.readthedocs.org/en/latest/Options.html#harakiri-verbose is the way to go.
Harakiri does print a stack trace of the killed process (https://uwsgi-docs.readthedocs.org/en/latest/Tracebacker.html?highlight=harakiri) straight to the uWSGI log, and thanks to the alarm system you can get notified by e-mail (http://uwsgi-docs.readthedocs.org/en/latest/AlarmSubsystem.html).
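To make the idea of a per-request time limit concrete, here is a minimal in-process sketch. It is not how uWSGI implements harakiri (uWSGI kills the worker from outside); it uses the standard `signal` module instead, works only on Unix and only in the main thread, and `run_with_time_limit`, `RequestTimeout` and `spin_forever` are hypothetical names.

```python
import signal


class RequestTimeout(Exception):
    """Raised inside the running code when the time limit expires."""


def run_with_time_limit(func, seconds):
    """Call func(), raising RequestTimeout if it runs longer than `seconds`."""

    def on_alarm(signum, frame):
        # `frame` is the code that was executing when the timer fired,
        # which is useful for reporting where the request was stuck.
        raise RequestTimeout(
            f"gave up after {seconds}s in {frame.f_code.co_name}, "
            f"line {frame.f_lineno}"
        )

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return func()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)    # cancel the timer
        signal.signal(signal.SIGALRM, previous)    # restore the old handler


def spin_forever():
    """Stand-in for a view with an accidental infinite loop."""
    while True:
        pass
```

The raised exception carries a normal traceback, so whatever error reporting is already in place for unhandled exceptions would show where the loop was running.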

I just tested this on Django's development server.
Results:
Does not give a timeout after 30 seconds. (This might be because it's not a production server, though.)
Stays in loading until I close the page.
I guess one way to avoid it, short of simply not writing code like that, would be to use threading so that you control the timeouts and are able to stop the thread.
Maybe something like:
    import threading
    from django.http import HttpResponse

    class MyThread(threading.Thread):
        def run(self):
            # Put the code that might loop forever here.
            print("your possible infinite loop code here")

    def possible_loop_view(request):
        # Run the risky work in a separate thread so the view returns at once.
        thread = MyThread()
        thread.start()
        return HttpResponse("html response")

Yes, your analysis is correct. The worker thread/process will keep running. Moreover, if there is no wait/sleep in the loop, it will hog the CPU. Other threads/processes will get very little CPU, making your entire site slow to respond.
Also, I don't think the server will explicitly send any timeout error to the client. If a TCP timeout is set, the TCP connection will be closed.
The client may also have its own timeout for receiving a response, which may come into the picture.
Not writing such code is the best way to avoid the problem. You can also run a monitoring tool on the server that watches CPU/memory usage and notifies you of abnormal activity so that you can take action.

Apache + mod_wsgi interaction

Before posting this, I have read quite a few resources online, including the mod_wsgi wiki, but I am confused about how exactly Apache processes/threads interact with mod_wsgi.
This is my current understanding: Apache can be configured to run such that one or more child processes can handle incoming requests, and each of these child processes can be configured to in turn use one or more threads to service requests. After that, things start getting hazy for me. My doubts are:
1 - What is a WSGIDaemonProcess, and who actually calls my Django app using the Python sub interpreter?
2 - If I have my Django app running under a mode where multiple threads are allowed in a single Apache child process, does that mean that multiple requests could be accessing my app at the same time? If so, could something like a module level variable (say, one holding a user's ID) be overwritten by other parallel requests and lead to non-thread-safe behavior?
3 - For the case above, with Python's global interpreter lock, would the threads actually be executing in parallel?

Answers to each of the points.
1 - WSGIDaemonProcess/WSGIProcessGroup indicate that mod_wsgi should fork off a separate process for running the WSGI application in. This is a fork only and not a fork/exec, so mod_wsgi is still in control of it. When it is detected that a URL maps to a WSGI application running in daemon mode, the mod_wsgi code in the Apache child worker processes will proxy the request details through to the daemon mode process, where the mod_wsgi code reads them and calls up into your WSGI application.
2 - Yes, multiple requests can be operating concurrently and be wanting to modify the module global data at the same time.
3 - For the time that execution is within Python itself then no, they aren't strictly running in parallel, as the global interpreter lock means that only one thread can be executing Python code at a time. The Python interpreter will periodically switch which thread is getting to run. If one of the threads calls into C code and releases the GIL, then at least for the time that thread is in that state it can run in parallel to other threads, whether they are running Python or C code. As an example, when calls are made down into the Apache/mod_wsgi layer to write back response data, the GIL is released. This means that the actual writing back of response data at the lower layers does not prevent other threads from running.
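The risk in point 2 can be shown with a small standard-library sketch. It is a simulation with plain threads rather than real requests, and `handle_request`, `simulate`, `current_user_global` and `request_state` are hypothetical names. Each "request" stores its user ID both in a module level global and in a `threading.local` object, then reads both back after every thread has written.

```python
import threading

current_user_global = None          # shared by every thread in the process
request_state = threading.local()   # each thread sees its own attributes


def handle_request(user_id, barrier, results):
    global current_user_global
    current_user_global = user_id     # unsafe: other threads overwrite this
    request_state.user_id = user_id   # safe: private to this thread
    barrier.wait()                    # ensure all threads have written
    results[user_id] = (current_user_global, request_state.user_id)


def simulate(user_ids):
    """Run one thread per user ID and return what each thread read back."""
    barrier = threading.Barrier(len(user_ids))
    results = {}
    threads = [
        threading.Thread(target=handle_request, args=(uid, barrier, results))
        for uid in user_ids
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Every thread reads the same value from the global (whichever write happened last), while the thread-local value is always the one that thread stored itself.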

Killing individual Apache processes in mod_python

We are having a problem with individual Apache processes utilizing large amounts of memory, depending on the request, and never releasing it back to the main system. Since these requests can happen at any time, over time the web server is pushed into swap, rendering it unresponsive even to SSH. Worse, after the request has finished, Python fails to release the memory back into the wild, which results in a number of 500 MB to 1 GB Apache processes lying around.
We push very few requests per second, but each request has the potential to be very heavy.
What I would like to do is have a way to kill an individual apache process child after it has finished serving a request if its resident memory exceeds a certain threshold. I have tried several ways of actually doing this inside mod_python, but it appears that any form of system exit results in the response not completing to the client.
Outside of gracefuling all the processes (which we really want to avoid) whenever this happens, is there any way to tell Apache to arbitrarily kill off a process after it has finished serving a request? All ideas are welcome.
As an additional caveat, due to the legacy nature of the system, we can’t upgrade to a later version of Python, so we can’t utilize the improved memory performance of 2.5. Similarly, we are stuck with our current OS.
Versions:
System: Red Hat Enterprise 4
Apache: 2.0.55
Python: 2.3.5

I'd say that even if it is possible, it would be a tremendous (and unstable) hack. In this case you should set up a process external to Apache that supervises the running processes and kills an individual Apache process when it goes beyond predefined memory/time limits.
Such a script can be kept running continuously with a main loop that performs its checks every few seconds, or could even be put in crontab to run every minute.
I see no reason to try to do that from inside the serving processes themselves.
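A rough sketch of the memory check such an external supervisor could perform is shown below. It assumes a Linux `/proc` filesystem and a modern Python 3 interpreter for the supervisor itself (not the Python 2.3.5 used by the application). `parse_rss_kb`, `rss_kb` and `pids_over_limit` are hypothetical names, and how the offending process is then stopped (which signal, and when) is left to the operator.

```python
def parse_rss_kb(status_text):
    """Extract resident memory in kB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0  # no VmRSS line, e.g. a kernel thread


def rss_kb(pid):
    """Read the resident memory of a live process; 0 if it has gone away."""
    try:
        with open(f"/proc/{pid}/status") as handle:
            return parse_rss_kb(handle.read())
    except OSError:
        return 0


def pids_over_limit(pids, limit_kb, read_rss=rss_kb):
    """Return the process IDs whose resident memory exceeds limit_kb."""
    return [pid for pid in pids if read_rss(pid) > limit_kb]
```

A cron job or main loop would collect the PIDs of the Apache children, call `pids_over_limit` with the chosen threshold, and then signal the processes it returns.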
