Configure Apache to recover from mod_python errors [duplicate]

Configure Apache to recover from mod_python errors [duplicate] - python

This question already has an answer here:
Testing for mysterious load errors in python/django
(1 answer)
Closed 3 years ago.
I am hosting a Django app on Apache using mod_python. Occasionally, I get some cryptic mod_python errors, usually of the ImportError variety, although not usually referring to the same module. The thing is, these seem to come up for a single forked subprocess, while the others operate fine, even when I force behavior that requires using the module that the problem process has errored on. Once the process encounters the error, it will always just serve the same traceback every time Apache chooses it to handle a request. (This is also a hassle, since my users don't necessarily report the error on the first occurrence, and once the process encounters the error.)
I know more about configuring Django than configuring Apache, but that won't get me anywhere since the request never reaches Django for processing. Ideally, I should solve the root problem, and that might involve my code, project, or machine configuration, but until then, I need help diagnosing and mitigating the problem.
Is there any way to configure the Apache logs to include a subprocess id?
Is there any way to force a subprocess to respawn if it has hit an error?
Are there any known issues relating to this that I should know about?

As a workaround, and assuming you are free to install new Apache modules on the server, you might try one of
mod_scgi
mod_fastcgi
mod_wsgi
instead. I use SCGI to connect an nginx frontend webserver to my Django apps, which highlights a major benefit (decoupling from the webserver). All of these packages are available in Debian, probably on RHELx as well.

Related

Is apache2 reload for .conf changes only or is it allowable to be used when application code changes?

When the code for my python WSGI applicaiton changes should I use apache2's reload or graceful restart feature?
Currently we use reload, but have noticed that sometimes the application does not load properly and errors pertaining to missing modules are logged to the error files even though the modules have existed for a long time.

If you can, you should probably use graceful. But if your application is not exiting correctly you may have to force it with just restart.
For wsgi, you should try running in daemon mode. When it is running in daemon mode, you can restart your service just by touching the wsgi file and updating its timestamp. This will reload all the code without restarting apache.
Here is more info: http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIDaemonProcess
This is for django, but may be useful for your project: http://code.google.com/p/modwsgi/wiki/IntegrationWithDjango

The 'reload' and 'graceful' would have the same effect as far as reloading your web application. If you are seeing issues with imports like you describe, it is likely to be an issue in your application code with you having import order dependencies or import cycles. One sees this a lot with people using Django. Suggest you actually post an example of the error you are getting.

Problem with Django using Apache2 (mod_wsgi), Occassionally is "unable to import from module" for no apparent reason

I have put my Django web site up to my web server and have it set up using apache2 and mod_wsgi.. everything works fine most of the time but occasionally it will just give the error that it can't import a module (usually from my views file). However, it's not an issue with that module as it usually works, for example, I will get the error "Cannot import classname from module" once, then reload the page and it works fine, I would say it's about 1 in 10 page loads where this occurs and it's just random as it will happen for any page on my site.
I have tried restarting apache2, restarting the server but the issue persists. I have tried it on different client machines, clearing out the user cache, etc but the issue persists. I don't know what might be doing this, would perhaps some sort of caching help prevent this as it seems that the server is just having an issue with sometimes not being able to fully process the request. I am using a cloud set up with not much memory on the server so maybe this is the problem? Any advice is appreciated

It is working most of the time because you likely have a multi process configuration and only one of the processes is affected.
You can try alternate WSGI script file as documented in:
http://blog.dscpl.com.au/2010/03/improved-wsgi-script-for-use-with.html
The jury is still out as to whether the issue is the differences between development server and proper deployment systems using WSGI, or whether it is users not handling imports properly and causing order dependencies or even import cycles. Problems possibly only come up when URL visited in certain order and thus why random as to when it can happen.

Python web hosting: Why are server restarts necessary?

We currently run a small shared hosting service for a couple of hundred small PHP sites on our servers. We'd like to offer Python support too, but from our initial research at least, a server restart seems to be required after each source code change.
Is this really the case? If so, we're just not going to be able to offer Python hosting support. Giving our clients the ability to upload files is easy, but we can't have them restart the (shared) server process!
PHP is easy -- you upload a new version of a file, the new version is run.
I've a lot of respect for the Python language and community, so find it hard to believe that it really requires such a crazy process to update a site's code. Please tell me I'm wrong! :-)

Python is a compiled language; the compiled byte code is cached by the Python process for later use, to improve performance. PHP, by default, is interpreted. It's a tradeoff between usability and speed.
If you're using a standard WSGI module, such as Apache's mod_wsgi, then you don't have to restart the server -- just touch the .wsgi file and the code will be reloaded. If you're using some weird server which doesn't support WSGI, you're sort of on your own usability-wise.

Depends on how you deploy the Python application. If it is as a pure Python CGI script, no restarts are necessary (not advised at all though, because it will be super slow). If you are using modwsgi in Apache, there are valid ways of reloading the source. modpython apparently has some support and accompanying issues for module reloading.
There are ways other than Apache to host Python application, including the CherryPy server, Paste Server, Zope, Twisted, and Tornado.
However, unless you have a specific reason not to use it (an since you are coming from presumably an Apache/PHP shop), I would highly recommed mod_wsgi on Apache. I know that Django recommends modwsgi on Apache and most of the other major Python frameworks will work on modwsgi.

Is this really the case?
It Depends. Code reloading is highly specific to the hosting solution. Most servers provide some way to automatically reload the WSGI script itself, but there's no standardisation; indeed, the question of how a WSGI Application object is connected to a web server at all differs widely across varying hosting environments. (You can just about make a single script file that works as deployment glue for CGI, mod_wsgi, passenger and ISAPI_WSGI, but it's not wholly trivial.)
What Python really struggles with, though, is module reloading. Which is problematic for WSGI applications because any non-trivial webapp will be encapsulating its functionality into modules and packages rather than simple standalone scripts. It turns out reloading modules is quite tricky, because if you reload() them one by one they can easily end up with bad references to old versions. Ideally the way forward would be to reload the whole Python interpreter when any file is updated, but in practice it seems some C extensions seem not to like this so it isn't generally done.
There are workarounds to reload a group of modules at once which can reliably update an application when one of its modules is touched. I use a deployment module that does this (which I haven't got around to publishing, but can chuck you a copy if you're interested) and it works great for my own webapps. But you do need a little discipline to make sure you don't accidentally start leaving references to your old modules' objects in other modules you aren't reloading; if you're talking loads of sites written by third parties whose code may be leaky, this might not be ideal.
In that case you might want to look at something like running mod_wsgi in daemon mode with an application group for each party and process-level reloading, and touch the WSGI script file when you've updated any of the modules.
You're right to complain; this (and many other WSGI deployment issues) could do with some standardisation help.

Common errors when moving a django app from dev to prod?

I am developping a django app on Windows, SQLite and the django dev server . I have deployed it to my host server which is running Linux, Apache, FastCgi, MySQL.
Unfortunately, I have an error returned by the server on the prod while everything ok on the dev machine. I've asked my provider for a pre-production solution in order to be able to debug and understand the problem.
Anyway, what are according to you the most likely errors that can happen when moving a django app from dev to prod?
Best
Update: I think that a pre-prod is the best way to address this kind of problem. But I would like to build a check list of what must be done before to put in production.
Thanks for the very valuable answers that I received until now :)
Update: FYI, I 've implemented the preprod server and the email notification as suggested by shanyu and I can see that the error comes from the smart_if templatetag that I am using on this new version. Any trick with template tags?
Update: I think I've fixed the pb which was caused I think by the Filezilla FTP sending. I was using the "replace if newer" option which I guess is causing some unexpected results. Using the "replace all" option fix the issue. However, it was an opportunity for me to learn more about deployment. Thansk for your answers.

Problems I typically have include:
Misconfigured productions settings, whether in my production localsettings.py, wsgi/cgi, or apache site files in /etc/sites-available
Database differences. I use South for migrations and have run into some subtle issues when performing my migration on PostgreSQL when it worked smoothly in sqlite.
Static file hosting since I cheat and use the Django server in development
Permissions, both on the file system and within the database
Rare, but possible, network issues preventing me from getting my dependencies, whether on PyPi or some 3rd party site
Ways that I have mitigated these issues:
Use the same database in production and development (in your case, MySQL everywhere)
I've found it is useful to have a "test" environment which mimics production in every way possible (it can be on lower end hardware, or even the same machine). This way, if there are any issues in this "production-like" enivornment, I can solve them without taking my production server offline.
Script everything for repeatable deployments. I use fabric, but zc.buildout or Paver would also work. These tools help reduce typos while deploying and reduce the time to deploy my app.
Use version control (mercurial, git, subversion) and a schema migration tool (like South), so if something does go wrong when you deploy to production, you have the possibility of backing out the changes and allowing production to run on the old code with the old database schema.
I haven't set up an "egg proxy" yet, but I am considering it, to avoid issues when downloading dependencies.
I've found pip's freezing dependencies to be useful, in case a new, incompatible change to a library occurred since I downloaded it initially
Use a web testing framework like Windmill or Selenium to test my application in my "test" environment, so that I can get a lot of test coverage of my system very quickly.

Regarding your case, I can think of 2 simple things that may help you:
You can enable Django to send messages when exceptions occur giving details about them. Look at here for details.
You'll be better off if you set up a test environment on the prod server (say, test.example.com) so that you can check if things will go smoothly or not before you deploy the app.

I believe these were the podcasts I listened to recently (from Pycon 2009):
Locate Django in the Real World (PyCon 2009):
http://advocacy.python.org/podcasts/pycon.rss
Parts 1 to 3
Very good introduction to designing your apps for deployment, in particular for reuse and redeployment.
Regs.

Cleanest & Fastest server setup for Django [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
I'm about to deploy a mediumsized site powered by Django. I have a dedicated Ubuntu Server.
I'm really confused over which serversoftware to use. So i thought to myself: why not ask stackoverflow.
What i'm looking for is:
Easy to set up
Fast and easy on resources
Can serve mediafiles
Able to serve multiple djangosites on same server
I would rather not install PHP or anything else that sucks resources, and for which I have no use for.
I have heard of mod_wsgi and mod_python on Apache, nginx and lighty. Which are the pros and cons of these and have i missed someone?
#Barry: Somehow i feel like Apache is to bloated for me. What about the alternatives?
#BrianLy: Ok I'll check out mod_wsgi some more. But why do i need Apache if i serve static files with lighty? I have also managed to serve the django app itself with lighty. Is that bad in anyway? Sorry for beeing so stupid :-)
UPDATE: What about lighty and nginx - which are the uses-cases when these are the perfect choice?

Since I was looking for some more in-depth answers, I decided to research the issue myself in depth. Please let me know if I've misunderstood anything.
Some general recommendation are to use a separate webserver for handling media. By separate, I mean a webserver which is not running Django. This server can be for instance:
Lighttpd (Lighty)
Nginx (EngineX)
Or some other light-weight server
Then, for Django, you can go down different paths. You can either:
Serve Django via Apache and:
mod_python
This is the stable and recommended/well documented way. Cons: uses a lot of memory.
mod_wsgi
From what I understand, mod_wsgi is a newer alternative. It appears to be faster and easier on resources.
mod_fastcgi
When using FastCGI you are delegating the serving of Django to another process. Since mod_python includes a python interpreter in every request it uses a lot of memory. This is a way to bypass that problem. Also there is some security concerns.
What you do is that you start your Django FastCGI server in a separate process and then configures apache via rewrites to call this process when needed.
Or you can:
Serve Django without using Apache but with another server that supports FastCGI natively:
(The documentation mentions that you can do this if you don't have any Apache specific needs. I guess the reason must be to save memory.)
Lighttpd
This is the server that runs Youtube. It seems fast and easy to use, however i've seen reports on memoryleaks.
nginx
I've seen benchmarks claiming that this server is even faster than lighttpd. It's mostly documented in Russian though.
Another thing, due to limitations in Python your server should be running in forked mode, not threaded.
So this is my current research, but I want more opinions and experiences.

I'm using Cherokee.
According to their benchmarks (grain of salt with them), it handles load better than both Lighttpd and nginx... But that's not why I use it.
I use it because if you type cherokee-admin, it starts a new server that you can log into (with a one-time password) and configure the whole server through a beautifully-done webmin. That's a killer feature. It has already saved me a lot of time. And it's saving my server a lot of resources!
As for django, I'm running it as a threaded SCGI process. Works well. Cherokee can keep it running too. Again, very nice feature.
The current Ubuntu repo version is very old so I'd advise you use their PPA. Good luck.

As #Barry said, the documentation uses mod_python. I haven't used Ubuntu as a server, but had a good experience with mod_wsgi on Solaris. You can find documentation for mod_wsgi and Django on the mod_wsgi site.
A quick review of your requirements:
Easy to setup I've found apache 2.2 fairly easy to build and install.
Fast and easy on resources I would say that this depends on your usage and traffic. * You may not want to server all files using Apache and use LightTPD (lighty) to server static files.
Can serve media files I assume you mean images, flash files? Apache can do this.
Multiple sites on same server Virtual server hosting on Apache.
Rather not install other extensions Comment out anything you don't want in the Apache config.

The officially recommended way to deploy a django project is to use mod_python with apache. This is described in the documentation. The main pro with this is that it is the best documented, most supported, and most common way to deploy. The con is that it probably isn't the fastest.

The best configuration is not so known I think. But here is:
Use nginx for serving requests (dynamic to app, static content directly).
Use python web server for serving dynamic content.
Two most speedy solutions for python-based web server is:
cogen
fapws2
You need to look into google to find current best configuration for django (still in development).

I’m using nginx (0.6.32 taken from Sid) with mod_wsgi. It works very well, though I can’t say whether it’s better than the alternatives because I never tried any. Nginx has memcached support built in, which can perhaps interoperate with the Django caching middleware (I don’t actually use it, instead I fill the cache manually using python-memcache and invalidate it when changes are made), so cache hits completely bypass Django (my development machine can serve about 3000 requests per second).
A caveat: nginx’ mod_wsgi highly dislikes named locations (it tries to pass them in SCRIPT_NAME), so the obvious ‘error_page 404 = #django’ will cause numerous obscure errors. I had to patch mod_wsgi source to fix that.

I'm struggling to understand all the options as well. In this blog post I found some benefits of mod_wsgi compared to mod_python explained.
Multiple low-traffic sites on a small VPS make RAM consumption the primary concern, and mod_python seems like a bad option there. Using lighttpd and FastCGI, I've managed to get the minimum memory usage of a simple Django site down to 58MiB virtual and 6.5MiB resident (after restarting and serving a single non-RAM-heavy request).
I've noticed that upgrading from Python 2.4 to 2.5 on Debian Etch increased the minimum memory footprint of the Python processes by a few percent. On the other hand, 2.5's better memory management might have a bigger opposite effect on long-running processes.

Keep it simple: Django recommends Apache and mod_wsgi (or mod_python). If serving media files is a very big part of your service, consider Amazon S3 or Rackspace CloudFiles.

There are many ways, approach to do this.For that reason, I recommend to read carefully the article related to the deployment process on DjangoAdvent.com:
Eric Florenzano - Deploying Django with FastCGI: http://djangoadvent.com/1.2/deploying-django-site-using-fastcgi/
Read too:
Mike Malone - Scaling Django
Stochastictechnologies Blog: The perfect Django Setup
Mikkel Hoegh Blog: 35 % Response-time-improvement-switching-uwsgi-nginx
Regards

In my opinion best/fastest stack is varnish-nginx-uwsgi-django.
And I'm successfully using it.

If you're using lighthttpd, you can also use FastCGI for serving Django. I'm not sure how the speed compares to mod_wsgi, but if memory serves correctly, you get a couple of the benefits that you would get with mod_wsgi that you wouldn't get with mod_python. The main one being that you can give each application its own process (which is really helpful for keeping memory of different apps separated as well as for taking advantage of multi-core computers.
Edit: Just to add in regards to your update about nginix, if memory serves correctly again, nginix uses "greenlets" to handle concurrency. This means that you may need to be a little bit more careful to make sure that one app doesn't eat up all the server's time.

We use nginx and FastCGI for all of our Django deployments. This is mostly because we usually deploy over at Slicehost, and don't want to donate all of our memory to Apache. I guess this would be our "use case".
As for the remarks about the documentation being mostly in Russian -- I've found most of the information on the English wiki to be very useful and accurate. This site has sample configurations for Django too, from which you can tweak your own nginx configuration.

I have a warning for using Cherokee. When you make changes to Django Cherokee maintains the OLD process, instead of killing it and starting a new one.
On Apache i strongly recommend this article.
http://www.djangofoo.com/17/django-mod_wsgi-deploy-exampl
Its easy to setup, easy to kill or reset after making changes.
Just type in terminal
sudo /etc/init.d/apache2 restart
and changes are seen instantly.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.