I need to execute some housekeeping code, but only in the development or production environment. Unfortunately, all management commands execute similarly to runserver. Is there any clean way to classify the execution environment and run the code selectively?
I saw some solutions like checking for 'runserver' in sys.argv,
but that does not work for production, and it does not look very clean.
Does Django provide anything to classify all these different scenarios the code is executing in?
Edit
The real problem is that we need to initialise our local cache with some frequently accessed data once the apps are loaded. In general, I want to fetch some specific information from the DB and cache it (currently in memory). The issue is that when the code tries to query the DB, the table may not have been created yet; in fact, there may not be any migration files at all. So when I run makemigrations/migrate, this code runs, tries to read from the DB, and throws an error saying the table does not exist. But if I can't run makemigrations/migrate, there will be no table, so it is a loop I'm trying to avoid. This part of the code runs for all management commands, but I would like to restrict its execution to when the app is actually serving requests (that is, when the cache is needed) and not to any management commands (including user-defined ones).
```python
from django.apps import AppConfig
from my_app.signals import app_created


class MyAppConfig(AppConfig):
    name = 'my_app'

    def ready(self):
        import my_app.signals
        # Code below should be executed only in actual app execution
        # And not in makemigration/migrate etc management commands
        app_created.send(sender=MyAppConfig, sent_by="MyApp")
```
Q) Send the app-created signal on actual app execution only, not on executions due to management commands like makemigrations, migrate, etc.
There are many different ways to do this, but generally, when I create a production (or staging, or development) server, I set an environment variable and dynamically decide which settings file to load based on it.
Imagine something like this in a Django settings file:
```python
import os

ENVIRONMENT = os.environ.get('ENVIRONMENT', 'development')
```
Then you can use
```python
from django.conf import settings

if settings.ENVIRONMENT == 'production':
    ...  # do something only on production
```
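If you go further and split the settings into per-environment modules, the same variable can choose which one Django loads. A minimal sketch for manage.py/wsgi.py, where the myproject.settings.* module names are hypothetical:

```python
import os

# Pick the settings module from the same ENVIRONMENT variable.
env = os.environ.get('ENVIRONMENT', 'development')
os.environ.setdefault('DJANGO_SETTINGS_MODULE', f'myproject.settings.{env}')
```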
Since I did not get a convincing answer and I managed to pull off a solution, although not 100% clean, I thought I would just share the solution I ended up with.
```python
import sys

from django.conf import settings

if (settings.DEBUG and 'runserver' in sys.argv) or not settings.DEBUG:
    """your code to run only in development and production"""
```
The rationale is: if it is not in DEBUG mode, you run the code no matter what; but if it is in DEBUG mode, check whether the process was started with runserver in its arguments.
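Applied to the AppConfig from the question, the check could look like this — a sketch, not a definitive pattern, and it inherits the same sys.argv caveats:

```python
import sys

from django.apps import AppConfig
from django.conf import settings

from my_app.signals import app_created


class MyAppConfig(AppConfig):
    name = 'my_app'

    def ready(self):
        import my_app.signals
        # Fire the signal only when actually serving requests,
        # not for makemigrations/migrate and other commands.
        if (settings.DEBUG and 'runserver' in sys.argv) or not settings.DEBUG:
            app_created.send(sender=MyAppConfig, sent_by="MyApp")
```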
Gurus, Wizards, Geeks
I am tasked with providing Python Flask apps (more generally, web apps written in Python) a way to reload properties on the fly.
Specifically, my team and I currently deploy Python apps with a {env}.properties file that contains various environment-specific configurations in a key-value format (YAML, for instance). Ideally, these properties are reloaded by the app when changed. Suppose a secondary application existed that updates the previously mentioned {env}.properties file; the application should ALSO be able to read and use the new values.
Currently, we read the {env}.properties at startup and the values are accessed via functions stored in a context.py. I could write a function that could periodically update the variables. Before starting an endeavor like this, I thought I would consult the collective to see if someone else has solved this for Django or Flask projects (as it seems like a reasonable request for feature flags, etc).
One such pattern is the WSGI application factory pattern.
In short, you define a function that instantiates the application object. This pattern works with all WSGI-based frameworks.
The Flask docs explain application factories pretty well.
This allows you to define the application dynamically on-the-fly, without the need to redeploy or deploy many configurations of an application. You can change just about anything about the app this way, including configuration, routes, middlewares, and more.
A simple example of this would be something like:
```python
from flask import Flask


def get_settings(env):
    """get the (current, updated) application settings"""
    ...
    return settings


def create_app(env: str):
    if env not in ('dev', 'staging', 'production'):
        raise ValueError(f'{env} is not a valid environment')
    app = Flask(__name__)
    app.config.update(get_settings(env))
    return app
```
Then you could set the FLASK_APP environment variable to something like "myapp:create_app('dev')" and that would do it. You can specify the factory the same way for servers like gunicorn.
The get_settings function should be written to return the newest settings. It could even do something like retrieve settings from an external source like S3, a config service, or anything.
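As a concrete illustration, get_settings could simply re-read the properties file on each call — a minimal sketch assuming PyYAML is available and a hypothetical settings-{env}.yaml file stands in for {env}.properties:

```python
import yaml  # PyYAML; assumed to be installed


def get_settings(env):
    """Re-read the environment's properties file so edits are picked up."""
    # The file name is a hypothetical stand-in for the {env}.properties file.
    with open(f'settings-{env}.yaml') as f:
        return yaml.safe_load(f) or {}
```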
I'm working on an App Engine project (Python) where we'd like to make certain changes to the app's behavior when debugging/developing (most often locally). For example, when debugging, we'd like to disable our rate-limiting decorators, turn on the debug param in the WSGIApplication, maybe add some asserts.
As far as I can tell, App Engine doesn't naturally have any concept of a global dev-mode or debug-mode, so I'm wondering how best to implement such a mode. The options I've been able to come up with so far:
Use google.appengine.api.app_identity.get_default_version_hostname() to get the hostname and check if it begins with localhost. This seems... unreliable, and doesn't allow for using the debug mode in a deployed app instance.
Use os.environ.get('APPLICATION_ID') to get the application id, which according to this page is automatically prepended with dev~ by the development server. Worryingly, the very source of this information is in a box warning:
Do not get the App ID from the environment variable. The development server simulates the production App Engine service. One way in which it does this is to prepend a string (dev~) to the APPLICATION_ID environment variable, which is similar to the string prepended in production for applications using the High Replication Datastore. You can modify this behavior with the --default_partition flag, choosing a value of "" to match the master-slave option in production. Google recommends always getting the application ID using get_application_id, as described above.
Not sure if this is an acceptable use of the environment variable. Either way it's probably equally hacky, and suffers the same problem of only working with a locally running instance.
Use a custom app-id for development (locally and deployed), use the -A flag in dev_appserver.py, and use google.appengine.api.app_identity.get_application_id() in the code. I don't like this for a number of reasons (namely having to have two separate app engine projects).
Use a dev App Engine version for development and detect it with os.environ.get('CURRENT_VERSION_ID').split('.')[0] in code. When deployed this is easy, but I'm not sure how to make dev_appserver.py use a custom version without modifying app.yaml. I suppose I could sed app.yaml to a temp file in /tmp/ with the version replaced and the relative paths resolved (or just create a persistent dev-app.yaml), then pass that into dev_appserver.py. But that also seems kinda dirty and prone to error/sync issues.
Am I missing any other approaches? Any considerations I failed to acknowledge? Any other advice?
In regard to "detecting" localhost development, we use the following in our application's settings/config file.
```python
import os

IS_DEV_APPSERVER = 'development' in os.environ.get('SERVER_SOFTWARE', '').lower()
```
That used in conjunction with the debug flag should do the trick for you.
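For example, the flag could gate the rate-limiting decorators mentioned in the question — a sketch in which rate_limit is a hypothetical stand-in for your real decorator, and the settings import path is assumed:

```python
import functools

from settings import IS_DEV_APPSERVER  # the flag defined above (path assumed)


def rate_limit(func):
    """Hypothetical stand-in for the real rate-limiting decorator."""
    if IS_DEV_APPSERVER:
        return func  # no-op while running on the dev appserver

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # ... real rate-limiting logic would go here ...
        return func(*args, **kwargs)
    return wrapper
```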
In Google App Engine I have created my own user API, appropriately called user, so it doesn't interfere with the Google App Engine users API. Like most multiuser websites, two "versions" of the site are available to the user, depending on whether or not they are logged in. Thus I created a file called router.py with the following code:
```python
import webapp2
from lib import user
import guest
import authorized

if user.isLoggedIn():
    app = webapp2.WSGIApplication(authorized.WSGIHandler, debug=True)
else:
    app = webapp2.WSGIApplication(guest.WSGIHandler, debug=True)
```
The guest and authorized modules are formatted like your conventional application script, for example:
```python
import webapp2
import os


class MainPage(webapp2.RequestHandler):
    def get(self, _random):
        self.response.out.write('authorized: ' + _random)


WSGIHandler = [('/(.*)', MainPage)]
```
Thus the router file easily selects which WSGIApplication URL dispatcher to use by grabbing the WSGIHandler variable from either the guest or the authorized module. However, the user must close all tabs for the router to detect a change in the isLoggedIn() function. If you log in, it does not recognize that you have done so until every tab is closed. I have two possible explanations for this:
isLoggedIn() uses os.environ['HTTP_COOKIE'] to retrieve cookies and see if a user is logged in; it then checks the cookie data against the database to make sure the cookie is valid. Possibly this could have an error where the cookies on the server's end aren't being refreshed when the page is? Maybe because I'm not getting the cookies from self.request.
Is it possible that, in order to conserve frontend hours or something, Google App Engine caches the scripts within the server's memcache? I doubt it, but I am at a loss to explain this behavior.
Thanks in advance for the help
Edit
Upon more testing I found that, as suspected, the router.py file responded correctly and directed the user based on logged-in state when a comment was added to it. This seems to indicate caching.
Edit 2
I have uncovered some more information on the WSGI application:
The Python runtime environment caches imported modules between requests on a single web server, similar to how a standalone Python application loads a module only once even if the module is imported by multiple files. Since WSGI handlers are modules, they are cached between requests. CGI handler scripts are only cached if they provide a main() routine; otherwise, the CGI handler script is loaded for every request.
I wonder how efficiently I could refresh the WSGI module somehow. This would undoubtedly tax the server, but it would solve my problem. Again, this seems to be only a partial solution.
Edit 3
Again, any attempt to randomize a comment in the router.py file is ineffective. The if statement checking for user login is completely overlooked, and the WSGIApplication stays in its original state. I'm not yet sure whether this is due to the module cache in the webapp2 module or the module cache on the user API. I suspect the latter.
The problem is not "caching"; it is just how WSGI applications work. A WSGI process stays alive for a reasonably long period of time and serves multiple requests during that period. The app is defined when that process starts up, and does not change until the process is renewed. You should not try to do anything dynamic or request-dependent at that point.
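One way around this is to make the choice inside a request-time callable instead of at import time. A minimal sketch, untested against App Engine specifically, where both applications are built once and only the dispatch happens per request:

```python
import webapp2
from lib import user
import guest
import authorized

# Build both applications once at import time; the choice between
# them is made on every request inside the dispatching callable.
_authorized_app = webapp2.WSGIApplication(authorized.WSGIHandler, debug=True)
_guest_app = webapp2.WSGIApplication(guest.WSGIHandler, debug=True)


def app(environ, start_response):
    """Pick the right application per request, not per process."""
    target = _authorized_app if user.isLoggedIn() else _guest_app
    return target(environ, start_response)
```

Note that for this to be reliable, isLoggedIn() would ideally read cookies from the per-request environ rather than os.environ.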
Replace router.py with:
```python
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from lib import user
import guest
import authorized


def main():
    if user.isLoggedIn():
        run_wsgi_app(authorized.application)
    else:
        run_wsgi_app(guest.application)


if __name__ == "__main__":
    main()
```
Downgrading to the old webapp allows you to change the WSGI application dynamically. It's tested and works perfectly! The CGI adaptor run_wsgi_app allows the webapp to change its handler mapping without caching.
The web app I am working on needs to perform a first-time setup or initialization;
where is a good place to put that logic? I don't want to check whether a configuration/setup exists on each request to / or before any request, as that's not very performant.
I was thinking of checking whether there is a sane configuration when the app starts up, then changing the default route of / to a settings/setup page, and changing it back afterwards. But that's a bit like self-modifying code.
This is required since the web app needs settings and then has to index stuff based on those settings, which takes a bit of time. So after the settings have been made, I still need to wait a while until the indexing is done, and any following requests will need to see a "wait, indexing" message.
I'm using Flask, but this is relevant for Django as well, I think.
EDIT: I'm thinking like this now:
When starting up, check the appconfig.py for MY_SETTINGS; if it is not there, add a default from config.py, put a status=firstrun object on app.config, and also change the / route to the setup view function.
The setup view function will then check the app.config status object and perform the setup of settings as necessary after user input. When the settings are okay, remove the status from app.config or change it to "indexing"; then I can have a before_request function that checks the status just to flash a message about it.
Or should I use flask.g instead of app.config to store the status?
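A rough sketch of that idea (MY_SETTINGS, the STATUS flag, and the setup endpoint are hypothetical names taken from the description above):

```python
from flask import Flask, flash, redirect, request, url_for

app = Flask(__name__)
app.secret_key = 'dev'  # flash() needs a session secret

# On startup: fall back to defaults and flag the first run.
if 'MY_SETTINGS' not in app.config:
    app.config.from_object('config')
    app.config['STATUS'] = 'firstrun'


@app.before_request
def check_status():
    status = app.config.get('STATUS')
    # Until configured, steer every request to the setup view.
    if status == 'firstrun' and request.endpoint not in ('setup', 'static'):
        return redirect(url_for('setup'))
    # While indexing, just surface a message and let requests through.
    if status == 'indexing':
        flash('Please wait, indexing is in progress')


@app.route('/setup', methods=['GET', 'POST'])
def setup():
    ...  # collect settings, then set app.config['STATUS'] = 'indexing'
```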
The proper way is to create a CLI script, preferably via Flask-Script if you use Flask (in Django it would be the default manage.py, where you can easily add custom commands too), and to define a function such as init or install:
```python
from flaskext.script import Manager
from ... import app

manager = Manager(app)


@manager.command
def init():
    """Initialize the application"""
    # your code here


if __name__ == '__main__':
    manager.run()
```
Then you mention it in your documentation and can easily assume that it has been run when the web application itself is accessed.
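With the manager.run() guard in place, the command would be invoked once at deploy time with something like python manage.py init (assuming the script is saved as manage.py).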
I have a Pylons app where I would like to move some of the logic to a separate batch process. I've been running it under the main app for testing, but it is going to be doing a lot of work in the database, and I'd like it to be a separate process that will be running in the background constantly. The main pylons app will submit jobs into the database, and the new process will do the work requested in each job.
How can I launch a controller as a standalone script?
I currently have:
```python
from warehouse2.controllers import importServer

importServer.runServer(60)
```
and in the controller file, but not part of the controller class:
```python
def runServer(sleep_secs):
    try:
        imp = ImportserverController()
        while True:
            imp.runImport()
            sleepFor(sleep_secs)
    except Exception, e:
        log.info("Unexpected error: %s" % sys.exc_info()[0])
        log.info(e)
```
But starting ImportServer.py on the command line results in:
2008-09-25 12:31:12.687000 Could not locate a bind configured on mapper Mapper|ImportJob|n_imports, SQL expression or this Session
If you want to load parts of a Pylons app, such as the models, from outside Pylons, load the Pylons app in the script first:
```python
from paste.deploy import appconfig
from pylons import config

from YOURPROJ.config.environment import load_environment

conf = appconfig('config:development.ini', relative_to='.')
load_environment(conf.global_conf, conf.local_conf)
```
That will load the Pylons app, which sets up most of the state so that you can proceed to use the SQLAlchemy models and Session to work with the database.
Note that if your code uses the Pylons globals such as request/response/etc., that won't work, since they require a request to be in progress to exist.
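Applied to the script from the question, the bootstrap would run before the controller import — a sketch that simply combines the two snippets (the warehouse2 paths are as in the question):

```python
from paste.deploy import appconfig
from warehouse2.config.environment import load_environment

# Load the Pylons environment first so the SQLAlchemy Session is bound.
conf = appconfig('config:development.ini', relative_to='.')
load_environment(conf.global_conf, conf.local_conf)

# Only now import and run the controller loop from the question.
from warehouse2.controllers import importServer

importServer.runServer(60)
```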
I'm redacting my response and upvoting the other answer by Ben Bangert, as it's the correct one. I answered and have since learned the correct way (mentioned below). If you really want to, check out the history of this answer to see the wrong (but working) solution I originally proposed.