Hot reloading properties in a Python Flask/Django app

Hot reloading properties in a Python Flask/Django app - python

Gurus, Wizards, Geeks
I am tasked with providing Python Flask apps (more generally, webapps written in python) a way to reload properties on the fly.
Specifically, my team and I currently deploy python apps with a {env}.properties file that contains various environment specific configurations in a key value format (yaml for instance). Ideally, these properties are reloaded by the app when changed. Suppose a secondary application existed that updates the previously mentioned {env}.properties file, the application should ALSO be able to read and use new values.
Currently, we read the {env}.properties at startup and the values are accessed via functions stored in a context.py. I could write a function that could periodically update the variables. Before starting an endeavor like this, I thought I would consult the collective to see if someone else has solved this for Django or Flask projects (as it seems like a reasonable request for feature flags, etc).

One such pattern is the WSGI application factory pattern.
In short, you define a function that instantiates the application object. This pattern works with all WSGI-based frameworks.
The Flask docs explain application factories pretty well.
This allows you to define the application dynamically on-the-fly, without the need to redeploy or deploy many configurations of an application. You can change just about anything about the app this way, including configuration, routes, middlewares, and more.
A simple example of this would be something like:
def get_settings(env):
"""get the (current, updated) application settings"""
...
return settings
def create_app(env: str):
if env not in ('dev', 'staging', 'production'):
raise ValueError(f'{env} is not a valid environment')
app = Flask(__name__)
app.config.update(get_settings(env))
return app
Then, you could set FLASK_APP environment variable to something like "myapp:create_app('dev')" and that would do it. This is also the same way you could specify this for servers like gunicorn.
The get_settings function should be written to return the newest settings. It could even do something like retrieve settings from an external source like S3, a config service, or anything.

Related

Why is App Engine Returning the Wrong Application ID?

The App Engine Dev Server documentation says the following:
The development server simulates the production App Engine service. One way in which it does this is to prepend a string (dev~) to the APPLICATION_IDenvironment variable. Google recommends always getting the application ID using get_application_id
In my application, I use different resources locally than I do on production. As such, I have the following for when I startup the App Engine instance:
import logging
from google.appengine.api.app_identity import app_identity
# ...
# other imports
# ...
DEV_IDENTIFIER = 'dev~'
application_id = app_identity.get_application_id()
is_development = DEV_IDENTIFIER in application_id
logging.info("The application ID is '%s'")
if is_development:
logging.warning("Using development configuration")
# ...
# set up application for development
# ...
# ...
Nevertheless, when I start my local dev server via the command line with dev_appserver.py app.yaml, I get the following output in my console:
INFO: The application ID is 'development-application'
WARNING: Using development configuration
Evidently, the dev~ identifier that the documentation claims will be preprended to my application ID is absent. I have also tried to use the App Engine Launcher UI to see if that changed anything, but it did not.
Note that 'development-application' is the name of my actual application, but I expected it to be 'dev~development-application'.

Google recommends always getting the application ID using get_application_id
But, that's if you cared about the application ID -- you don't: you care about the partition. Check out the source -- it's published at https://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/api/app_identity/app_identity.py .
get_app_identity uses os.getenv('APPLICATION_ID') then passes that to internal function _ParseFullAppId -- which splits it by _PARTITION_SEPARATOR = '~' (thus removing again the dev~ prefix that dev_appserver.py prepended to the environment variable). That's returned as the "partition" to get_app_identity (which ignores it, only returning the application ID in the strict sense).
Unfortunately, there is no architected way to get just the partition (which is in fact all you care about).
I would recommend that, to distinguish whether you're running locally or "in production" (i.e, on Google's servers at appspot.com), in order to access different resources in each case, you take inspiration from the way Google's own example does it -- specifically, check out the app.py example at https://cloud.google.com/appengine/docs/python/cloud-sql/#Python_Using_a_local_MySQL_instance_during_development .
In that example, the point is to access a Cloud SQL instance if you're running in production, but a local MySQL instance instead if you're running locally. But that's secondary -- let's focus instead on, how does Google's own example tell which is the case? The relevant code is...:
if (os.getenv('SERVER_SOFTWARE') and
os.getenv('SERVER_SOFTWARE').startswith('Google App Engine/')):
...snipped: what to do if you're in production!...
else:
...snipped: what to do if you're in the local server!...
So, this is the test I'd recommend you use.
Well, as a Python guru, I'm actually slightly embarassed that my colleagues are using this slightly-inferior Python code (with two calls to os.getenv) -- me, I'd code it as follows...:
in_prod = os.getenv('SERVER_SOFTWARE', '').startswith('Google App Engine/')
if in_prod:
...whatever you want to do if we're in production...
else:
...whatever you want to do if we're in the local server...
but, this is exactly the same semantics, just expressed in more elegant Python (exploiting the second optional argument to os.getenv to supply a default value).
I'll be trying to get this small Python improvement into that example and to also place it in the doc page you were using (there's no reason anybody just needing to find out if their app is being run in prod or locally should ever have looked at the docs about Cloud SQL use -- so, this is a documentation goof on our part, and, I apologize). But, while I'm working to get our docs improved, I hope this SO answer is enough to let you proceed confidently.

That documentation seems wrong, when I run the commands locally it just spits out the name from app.yaml.
That being said, we use
import os
os.getenv('SERVER_SOFTWARE', '').startswith('Dev')
to check if it is the dev appserver.

Using url_for between two applications

I liked a lot of the conventions the Overholt example used, but ran into a specific problem.
I have two apps set up using the DispatcherMiddleware object from werkzeug.wsgi:
from werkzeug.wsgi import DispatcherMiddleware
from myapp import api, frontend
application = DispatcherMiddleware(frontend.create_app(), {
'/api': api.create_app()
})
This works great; the end points are all there. Inspecting application.app.url_map shows the mappings for frontend, and application.mounts['/api'].url_map shows the mappings for api correctly.
The problem I'm running into is I'd like to use url_for() in my frontend templates for methods in api, but haven't found a way to make that work. Hardcoding the URL paths works, but will cause problems later if I want to move things around.

What you can do is add a new route to your back-end, say /api/route-map which spits out a map (dictionary/JSON) of the routes (you can use url_for to generate the map) and hit this route from your front-end to get the dynamic route map, which you can use through out your front-end templates with your custom jinja2 function (which you can create as shown below).
def api_url_for(route_fn_string):
"""
This is just boilerplate code. Please do some checking here.
'"""
return route_map[route_fn_string]
app.jinja_env.globals.update(api_url_for=api_url_for)
Now you can do {{api_url_for('update')}} in your jinja2 template to get its actual route.
If you're putting both the apps on the same server, then you can simply share the route map as a global or through a getter function.

Python App Engine debug/dev mode

I'm working on an App Engine project (Python) where we'd like to make certain changes to the app's behavior when debugging/developing (most often locally). For example, when debugging, we'd like to disable our rate-limiting decorators, turn on the debug param in the WSGIApplication, maybe add some asserts.
As far as I can tell, App Engine doesn't naturally have any concept of a global dev-mode or debug-mode, so I'm wondering how best to implement such a mode. The options I've been able to come up with so far:
Use google.appengine.api.app_identity.get_default_version_hostname() to get the hostname and check if it begins with localhost. This seems... unreliable, and doesn't allow for using the debug mode in a deployed app instance.
Use os.environ.get('APPLICATION_ID') to get the application id, which according to this page is automatically prepended with dev~ by the development server. Worryingly, the very source of this information is in a box warning:
Do not get the App ID from the environment variable. The development
server simulates the production App Engine service. One way in which
it does this is to prepend a string (dev~) to the APPLICATION_ID
environment variable, which is similar to the string prepended in
production for applications using the High Replication Datastore. You
can modify this behavior with the --default_partition flag, choosing a
value of "" to match the master-slave option in production. Google
recommends always getting the application ID using get_application_id,
as described above.
Not sure if this is an acceptable use of the environment variable. Either way it's probably equally hacky, and suffers the same problem of only working with a locally running instance.
Use a custom app-id for development (locally and deployed), use the -A flag in dev_appserver.py, and use google.appengine.api.app_identity.get_application_id() in the code. I don't like this for a number of reasons (namely having to have two separate app engine projects).
Use a dev app engine version for development and detect with os.environ.get('CURRENT_VERSION_ID').split('.')[0] in code. When deployed this is easy, but I'm not sure how to make dev_appserver.py use a custom version without modifying app.yaml. I suppose I could sed app.yaml to a temp file in /tmp/ with the version replaced and the relative paths resolved (or just create a persistent dev-app.yaml), then pass that into dev_appserver.py. But that seems also kinda dirty and prone to error/sync issues.
Am I missing any other approaches? Any considerations I failed to acknowledge? Any other advice?

In regards to "detecting" localhost development we use the following in our applications settings / config file.
IS_DEV_APPSERVER = 'development' in os.environ.get('SERVER_SOFTWARE', '').lower()
That used in conjunction with the debug flag should do the trick for you.

WebApp needs to do something first time only, where to put that logic? Flask or django

The web app I am working on needs to perform a first-time setup or initialization,
where is a good place to put that logic? I dont want to perform the check if a configuration/setup exists on each request to / or before any request as thats kind of not performant.
I was thinking of performing a check if there is a sane configuration when the app starts up, then change the default route of / to a settings/setup page, and change it back. But thats like self-changing code a bit.
This is required since the web app needs settings and then to index stuff based on those settings which take a bit of time. So after the settings have been made, I still need to wait a while until the indexing is done. So even after the settings/setup has been made, any requests following, will need to see a "wait indexing" message.
Im using flask, but this is relevant for django as well I think.
EDIT: Im thinking like this now;
When starting up, check the appconfig.py for MY_SETTINGS, if it is not there
add a default from config.py and put a status=firstrun object on the app.config, also
change the / route to setup view function.
The setup view function will then check for the app.config.status object and perform
The setup of settings as necessary after user input, when the settings are okay,
remove app.config.status or change it to "indexing", then I can have a before_request function to check for the app.config.status just to flash a message of it.
Or I could use the flask.g instead of app.config to store the status?

The proper way is creating a CLI script, preferably via Flask-Script if you use Flask (in Django it would be the default manage.py where you can easily add custom commands, too) and defining a function such as init or install:
from flaskext.script import Manager
from ... import app
manager = Manager(app)
#manager.command
def init():
"""Initialize the application"""
# your code here
Then you mention it in your documentation and can easily assume that it has been run when the web application itself is accessed.

Using CherryPy/Cherryd to launch multiple Flask instances

Per suggestions on SO/SF and other sites, I am using CherryPy as the WSGI server to launch multiple instances of a Python web server I built with Flask. Each instance runs on its own port and sits behind Nginx. I should note that the below does work for me, but I'm troubled that I have gone about things the wrong way and it works "by accident".
Here is my current cherrypy.conf file:
[global]
server.socket_host = '0.0.0.0'
server.socket_port = 8891
request.dispatch: cherrypy.dispatch.MethodDispatcher()
tree.mount = {'/':my_flask_server.app}
Without diving too far into my Flask server, here's how it starts:
import flask
app = flask.Flask(__name__)
#app.route('/')
def hello_world():
return "hello"
And here is the command I issue on the command line to launch with Cherryd:
cherryd -c cherrypy.conf -i my_flask_server
Questions are:
Is wrapping Flask inside CherryPy still the preferred method of using Flask in production? https://stackoverflow.com/questions/4884541/cherrypy-vs-flask-werkzeug
Is this the proper way to use a .conf file to launch CherryPy and import the Flask app? I have scoured the CherryPy documentation, but I cannot find any use cases that match what I am trying to do here specifically.
Is the proper way to launch multiple CherryPy/Flask instances on a single machine to execute multiple cherryd commands (daemonizing with -d, etc) with unique .conf files for each port to be used (8891, 8892, etc)? Or is there a better "CherryPy" way to accomplish this?
Thanks for any help and insight.

I can't speak for Flask, but I can for CherryPy. That looks like the "proper way"...mostly. That line about a MethodDispatcher is a no-op since it only affects CherryPy Applications, and you don't appear to have mounted any (just a single Flask app instead).
Regarding point 3, you have it right. CherryPy allows you to run multiple Server objects in the same process in order to listen on multiple ports (or protocols), but it doesn't have any sugar for starting up multiple processes. As you say, multiple cherryd commands with varying config files is how to do it (unless you want to use a more integrated cluster/config management tool like eggmonster).

Terminology: Mounting vs Grafting
In principle this is a proper way to serve a flask app through cherrypy, just a quick note on your naming:
It is worth noting here that tree.mount is not a configuration key by itself - tree will lead to cherrypy._cpconfig._tree_config_handler(k, v) being called with the arguments 'mount', {'/': my_flask_server.app}.
The key parameter is not used at all by the _tree_config_handler so in your config "mount" is just an arbitrary label for that specific dict of path mappings. It also does not "mount" the application (it's not a CherryPy app after all). By that I mean, it does not cherrypy.tree.mount(…) it but rather cherrypy.tree.grafts an arbitrary WSGI handler onto your "script-name" (paths, but in CherryPy terminology) namespace.
Cherrypy's log message somewhat misleadingly says "Mounted <app as string> on /"]
This is a somewhat important point since with graft, unlike mount, you cannot specify further options such as static file service for your app or streaming responses on that path.
So I would recommend changing the tree.mount config key to something descriptive that does not invite reading too much semantics about what happens within CherryPy (since there is the cherrypy.tree.mount method) due to that config. E.g., tree.flask_app_name if you're just mapping that one app in that dict (there can be many tree directives, all of them just getting merged into the paths namespace) or tree.wsgi_delegates if you map many apps in that dict.
Using CherryPy to serve additional content without making an app of it
Another side note, if you want cherrypy to e.g. provide static file service for your app, you don't have to create a boilerplate cherrypy app to hold that configuration. You just have to mount None with the appropriate additional config. The following files would suffice to have CherryPy to serve static content from the subdirectory 'static' if they are put into the directory where you launch cherryd to serve static content (invoke cherryd as cherryd -c cherrypy.conf -i my_flask_server -i static:
static.py
import cherrypy
# next line could also have config as an inline dict, but
# file config is often easier to handle
cherrypy.tree.mount(None, '/static-path', 'static.conf')
static.conf
# static.conf
[/]
tools.staticdir.on = True
tools.staticdir.root = os.getcwd()
tools.staticdir.dir = 'static'
tools.staticdir.index = 'index.html'

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.