Detect if flask is being run via gunicorn? - python

Is there a way I can check if my flask app is being run inside a gunicorn container? Currently I set an environment variable to tell my application this, but I'd prefer that it be automatic. Additionally, is there some way I can check what worker class is being used?
I need to detect this for a few different reasons. Note that I typically use gunicorn, but during testing I sometimes don't.
Logging: I attach to a gunicorn info log when run in gunicorn, otherwise to a stdout log.
Eventlet/subprocess: Since I use subprocesses I need to ensure that the proper monkey_patch'ing is done when using eventlet, otherwise it doesn't behave correctly. (I call many subprocesses).

Late to the party, but a very hacky solution that seems to work:
import os
is_gunicorn = "gunicorn" in os.environ.get("SERVER_SOFTWARE", "")
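A minimal sketch of how that check could drive the logging choice from the question. The helper names running_under_gunicorn and pick_logger and the "app.stdout" logger name are hypothetical; "gunicorn.error" is the logger gunicorn uses for its error log, which also carries info-level messages. It does not address detecting the worker class.

```python
import logging
import os
import sys


def running_under_gunicorn(environ=None):
    """Return True if SERVER_SOFTWARE suggests gunicorn started this process."""
    environ = os.environ if environ is None else environ
    return "gunicorn" in environ.get("SERVER_SOFTWARE", "")


def pick_logger(environ=None):
    """Reuse gunicorn's logger when available, otherwise log to stdout."""
    if running_under_gunicorn(environ):
        return logging.getLogger("gunicorn.error")
    logger = logging.getLogger("app.stdout")  # hypothetical logger name
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler(sys.stdout))
    logger.setLevel(logging.INFO)
    return logger
```

Passing the environment in as an argument keeps the check easy to exercise in tests without touching os.environ.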

Related

Does django-celery-beat deal with multiple instances of django processes created by web-servers

I have some fairly complex periodic tasks that need to be offloaded from the django context. django-celery-beat looks promising. While I was going through the celery-beat docs, I found this:
You have to ensure only a single scheduler is running for a schedule at a time, otherwise you’d end up with duplicate tasks. Using a centralized approach means the schedule doesn’t have to be synchronized, and the service can operate without using locks.
A typical production deployment will spawn a pool of worker-processes each running a django instance. Will that result in creation of multiple scheduler processes as well? Do I need to have some synchronisation logic?
Thanks for your time!
It does not.
You can dig into the issues page on their github repo for confirmation. I think it's weird that the documentation doesn't call this out, but I suppose you have to assume that's how all celery beats work unless they specify otherwise.
In theory, you could build your own synchronization, but it will probably be a better experience to use a different scheduler that has that functionality built in, like Heroku's redbeat: https://blog.heroku.com/redbeat-celery-beat-scheduler.

How can I get celery worker to only require broker_read_url

https://docs.celeryproject.org/en/master/userguide/configuration.html#broker-read-url-broker-write-url
When I specify only the broker_read_url in celeryconfig.py, for some reason celery falls back to the default amqp://localhost:5672.
This doesn't make sense to me. Why would a worker need to write to the broker? Looking at the source code, I can't see anything besides the purge option which would require it.
Is there a way around this? I've tried putting in a dummy url, and that doesn't work.
edit:
More details about my setup. I've got a rmq shovel set up, which I'm writing to, and a rmq instance I'm reading from. In my producer app, I want to configure only broker_write_url, and in the consuming worker, I want to configure only broker_read_url.
Since I can't do this, my assumption is that celery must be using it for something, but I can't really tell what looking through the code.
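For reference, the configuration page linked above describes broker_read_url as the connection used for consuming and broker_write_url as the one used for producing, with both defaulting to broker_url. A minimal sketch of a celeryconfig.py that sets both; the hostnames and credentials are hypothetical placeholders, and whether a worker can run with only one of the two set is exactly the open question here.

```python
# celeryconfig.py (sketch; hosts and credentials are hypothetical placeholders)

# Connection used when consuming messages (the rmq instance being read from).
broker_read_url = "amqp://user:password@consume-host:5672//"

# Connection used when publishing messages (the rmq shovel being written to).
broker_write_url = "amqp://user:password@publish-host:5672//"
```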

Run code on first Django start

I have a Django application that displays a webpage with data from a model, based on the primary key passed in the URL. This all works fine, and the Django component is working perfectly for the most part.
My question is how I can make it so that, when the Django server boots up, code is called that creates a separate thread to monitor an external source and log valid data from that source as a model in the database. I have tried multiple methods, such as using an AppConfig.
I have the threading code written, along with the section that creates the model and saves it in the database. My issue is that if I try to use an AppConfig to create the thread, I get a django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet. error and the server does not boot up.
Where would be appropriate to place the code? Is my approach incorrect to the matter?
Trying to use threading to get around blocking processes like web servers is an exercise in pain. I've done it before and it's fragile and often yields unpredictable results.
A much easier idea is to create a separate worker that runs in a totally different process that you start separately. It would have the same database access and could even use your Django models. This is how hosts like Heroku approach this problem. It comes with the added benefit of being able to be tested separately and doesn't need to run at all while you're working on your main Django application.
These days, with a multitude of virtualization options like Vagrant and containerization options like Docker, running parallel processes and workers is trivial. In the wild they may literally be running on separate servers with your database on yet another server. As was mentioned in the comments, starting a worker process could easily be delegated to a separate Django management command. This, in turn, can be fairly easily turned into separate worker processes by gunicorn on your web server.
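A rough sketch of the separate-worker idea, kept framework-free so it can be tested on its own. The names run_worker, fetch, and save are hypothetical; in a Django project the loop would typically be called from a management command's handle() method, where the app registry is already loaded, with save creating and saving the model instance.

```python
import threading


def run_worker(fetch, save, stop_event, interval=1.0):
    """Poll fetch() until stop_event is set; pass each valid item to save().

    fetch returns the next item from the external source, or None when
    there is nothing valid to store. Returns the number of items saved.
    """
    saved = 0
    while not stop_event.is_set():
        item = fetch()
        if item is not None:
            save(item)
            saved += 1
        # Sleep between polls, but wake immediately if asked to stop.
        stop_event.wait(interval)
    return saved
```

Because the worker runs in its own process, it can be started, stopped, and tested independently of the web server.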

python web server and periodic tasks

I am using CherryPy to receive requests through REST API. Apart from handling requests the application should also do some resource management every few seconds. What is the easiest way to do this?
1) run a separate thread
2) cherrypy.process.plugins.PerpetualTimer (not sure how to use it, and it looks like it is heavy on resources?)
3) some other way?
The solution with a separate thread is fine by me, but I was wondering if there is a nicer way to do it?
Note that CherryPy is not a requirement - I have decided to use it primarily because the project looks alive and because it supports multiple simultaneous connections (in other words: I am open to alternatives).
PerpetualTimer is just a repeating version of threading._Timer.
What you really want to use is cherrypy.process.plugins.Monitor, which is little more than a way to run a separate thread for you. You should use it because it plugs into cherrypy.engine, which governs start and stop behavior for CherryPy servers. If you run your own thread, you're going to want to have it stop when CP shuts down anyway; the Monitor class already knows how to do that. Older versions use PerpetualTimer under the hood; in recent versions it was replaced by the BackgroundTask class.
import cherrypy
from cherrypy.process.plugins import Monitor
my_task_runner = Monitor(cherrypy.engine, my_task, frequency=3)
my_task_runner.subscribe()
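For comparison, option 1 (a separate thread) can be sketched with the standard library alone. PeriodicTask is a hypothetical name, not a CherryPy class; unlike Monitor, nothing here hooks into cherrypy.engine, so stop() has to be called by hand at shutdown.

```python
import threading


class PeriodicTask:
    """Run callback every `frequency` seconds on a background thread."""

    def __init__(self, callback, frequency):
        self._callback = callback
        self._frequency = frequency
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait() returns True once stop() is called, ending the loop.
        while not self._stop.wait(self._frequency):
            self._callback()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

Waiting on an Event instead of calling time.sleep() lets stop() interrupt the pause between runs, which is the shutdown handling Monitor otherwise provides.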

Replace AppEngine Devserver With Spawning (BaseHTTPRequestHandler as WSGI)

I'm looking to replace AppEngine's devserver with spawning. Spawning handles standard wsgi handlers, just like appengine, so running your app on it is easy.
But the devserver takes into account your app.yaml file that has url redirects etc. I've been going through the devserver code and it is pretty easy to get the BaseHTTPRequestHandler like this:
import os
from google.appengine.tools.dev_appserver import CreateRequestHandler
dev = CreateRequestHandler(os.path.dirname(__file__), '', require_indexes=False, static_caching=True)
But the BaseHTTPRequestHandler is not a WSGI app, so my guess is I need to put something around it to make it work. Any hints?
I don't think you're going to be able to pull out a part of the dev_appserver and use it in a custom WSGI server quite so easily. The dev_appserver does a lot of 'magic', and it isn't really structured to be pulled out and used as a WSGI wrapper in another server (more's the pity).
You may want to check out TwistedAE, which is working on creating an alternate serving environment; if you really want to use spawning, you can probably use TwistedAE's work as a basis.
That said, if you do want to do it yourself, there are a few options:
1) You can write your own shim to interface WSGI with the class returned by CreateRequestHandler. In that case, you need to replicate the interface in BaseHTTPServer.BaseHTTPRequestHandler from the Python SDK. Converting WSGI to that, just so the dev_appserver code can convert it back, seems a bit perverse, though.
2) You can rip out the code from the _HandleRequest method of DevAppServerRequestHandler, modify it to work with WSGI, and create a WSGI app from that (probably your best bet if you want to DIY).
3) You can start from scratch, which I believe is the approach taken by TwistedAE.
One thing to bear in mind whatever you do: App Engine explicitly expects a single-threaded environment for its apps. Don't use a multithreaded approach if you want apps to work the same locally as they do in production or on the dev_appserver!
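To give a feel for the shim approach (wrapping a BaseHTTPRequestHandler class so it looks like a WSGI app), here is a rough sketch. It assumes Python 3's http.server rather than the Python 2 BaseHTTPServer module the SDK used, assumes the handler keeps the default unbuffered wfile, and passes None as the server object, which the dev_appserver handler may well not tolerate. The names _FakeSocket and wsgi_from_handler are hypothetical.

```python
import io
from http.client import HTTPResponse
from http.server import BaseHTTPRequestHandler
from urllib.parse import quote
from wsgiref.util import is_hop_by_hop


class _FakeSocket:
    """In-memory stand-in for the socket a handler or HTTPResponse expects."""

    def __init__(self, incoming=b""):
        self._incoming = incoming
        self.outgoing = io.BytesIO()

    def makefile(self, mode="rb", *args, **kwargs):
        return io.BytesIO(self._incoming)

    def sendall(self, data):
        self.outgoing.write(data)


def wsgi_from_handler(handler_class):
    """Wrap a BaseHTTPRequestHandler subclass as a WSGI application."""

    def app(environ, start_response):
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length) if length else b""
        # Rebuild the raw HTTP request that the handler will parse.
        path = quote(environ.get("PATH_INFO", "/").encode("latin-1")) or "/"
        if environ.get("QUERY_STRING"):
            path += "?" + environ["QUERY_STRING"]
        lines = ["%s %s HTTP/1.0" % (environ["REQUEST_METHOD"], path)]
        for key, value in environ.items():
            if key.startswith("HTTP_"):
                lines.append("%s: %s" % (key[5:].replace("_", "-").title(), value))
        if environ.get("CONTENT_TYPE"):
            lines.append("Content-Type: %s" % environ["CONTENT_TYPE"])
        if length:
            lines.append("Content-Length: %d" % length)
        raw = ("\r\n".join(lines) + "\r\n\r\n").encode("latin-1") + body

        # Instantiating the handler runs setup(), handle() and finish().
        sock = _FakeSocket(raw)
        handler_class(sock, (environ.get("REMOTE_ADDR", "127.0.0.1"), 0), None)

        # Parse the bytes the handler wrote back into status, headers, body.
        response = HTTPResponse(_FakeSocket(sock.outgoing.getvalue()))
        response.begin()
        headers = [(k, v) for k, v in response.getheaders() if not is_hop_by_hop(k)]
        start_response("%d %s" % (response.status, response.reason), headers)
        return [response.read()]

    return app
```

Each request is handled synchronously inside the WSGI call, so the sketch itself adds no threading; whether the hosting WSGI server is single-threaded is a separate concern.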
