How to set up global flag in appengine standard - python

I need to set a global application setting in my app. I don't really want to use the database, since it's just a single flag, and I can't use Memcache because it's not durable. I'm also not sure whether a change to an environment variable is shared with all instances.
This setting might be changed in the app (like once a month).
Is there any other Google service where I could put it? There will be quite a lot of reads.

Here are a few options:
Store it in your code.
Store it in an environment variable. This will be the same for all instances.
Store it in Datastore, but read the value into a Python module variable so you don't need to hit Datastore after the first read (see the sketch after this list).
Google recently deployed "Secrets management". I haven't tried it yet, but it could work for this.
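For the Datastore option, a minimal sketch of caching the flag in a module-level variable after the first read (FeatureFlag and the 'global' key name are hypothetical, for illustration):
from google.appengine.ext import ndb

class FeatureFlag(ndb.Model):  # hypothetical single-entity model
    enabled = ndb.BooleanProperty(default=False)

_flag = None  # module-level cache; one Datastore read per instance

def get_flag():
    """Return the flag, hitting Datastore only on the first call."""
    global _flag
    if _flag is None:
        _flag = FeatureFlag.get_or_insert('global')
    return _flag.enabled
Note that each instance caches its own copy, so after changing the flag you would either restart instances or add a time-based refresh.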

Related

Hot reloading properties in a Python Flask/Django app

Gurus, Wizards, Geeks
I am tasked with giving Python Flask apps (more generally, webapps written in Python) a way to reload properties on the fly.
Specifically, my team and I currently deploy Python apps with an {env}.properties file that contains various environment-specific configuration in a key-value format (YAML, for instance). Ideally, these properties are reloaded by the app when they change: if a secondary application updated the aforementioned {env}.properties file, the application should ALSO be able to read and use the new values.
Currently, we read {env}.properties at startup, and the values are accessed via functions stored in a context.py. I could write a function that periodically updates the variables, but before starting an endeavor like that I thought I would consult the collective to see if someone else has solved this for Django or Flask projects (it seems like a reasonable request for feature flags, etc.).
One such pattern is the WSGI application factory pattern.
In short, you define a function that instantiates the application object. This pattern works with all WSGI-based frameworks.
The Flask docs explain application factories pretty well.
This allows you to define the application dynamically on-the-fly, without the need to redeploy or deploy many configurations of an application. You can change just about anything about the app this way, including configuration, routes, middlewares, and more.
A simple example of this would be something like:
from flask import Flask

def get_settings(env):
    """Get the (current, updated) application settings."""
    ...
    return settings

def create_app(env: str):
    if env not in ('dev', 'staging', 'production'):
        raise ValueError(f'{env} is not a valid environment')
    app = Flask(__name__)
    app.config.update(get_settings(env))
    return app
Then you could set the FLASK_APP environment variable to something like "myapp:create_app('dev')" and that would do it. This is also how you would specify the app for servers like gunicorn.
The get_settings function should be written so that it returns the newest settings. It could even retrieve them from an external source like S3, a config service, or anything else.
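For the file-based setup described in the question, get_settings could simply re-read the properties file on every call, so edits take effect whenever the app object is (re)created. A minimal sketch, assuming a YAML-formatted {env}.properties file and PyYAML installed:
import yaml  # assumes PyYAML is installed

def get_settings(env):
    """Re-read the environment's properties file so edits take effect."""
    with open(f'{env}.properties') as f:
        return yaml.safe_load(f) or {}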

Is it possible to use Whoosh search in a serverless environment?

I'm trying to set up Whoosh search in a serverless environment (aws lambda hosted api) and having trouble with Whoosh since it hosts the index on the local filesystem. That becomes an issue with containers that aren't able to update and reference a single index.
Does anyone know if there is a solution to this problem? I am able to choose the location where the index directory is hosted, but it has to be on the local filesystem. Is there a way to represent an S3 file as a local file?
I'm currently having to reindex every time the app is initialized, and while it works, it's clearly an expensive and terrible workaround.
The answer seems to be no. Serverless environments are ephemeral by default and don't provide the persistent storage needed to hold an index like the one Whoosh generates.
You can always use Whoosh in RAM.
from whoosh.fields import ID, TEXT, Schema
from whoosh.filedb.filestore import RamStorage

schema = Schema(path=ID(stored=True), body=TEXT)
store = RamStorage()
ix = store.create_index(schema)
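A quick usage sketch against the in-memory index (the path and body fields come from the schema above):
from whoosh.qparser import QueryParser

with ix.writer() as writer:
    writer.add_document(path=u'/a', body=u'hello serverless world')

with ix.searcher() as searcher:
    query = QueryParser('body', ix.schema).parse(u'serverless')
    print([hit['path'] for hit in searcher.search(query)])
The index still has to be rebuilt whenever a container starts, but this avoids the local-filesystem problem entirely.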

Concepts, use and testing Cloud Datastore in local

I'm really confused about how to try out Datastore locally. Please give me a minute to explain.
I'm developing an app composed of a few microservices within a single GAE app. One part of the app uses the Datastore. When I run the app with the development server and save something in the Datastore through one of my methods, I can see the entity perfectly well in GAE's admin web portal.
Now, instead of calling the ndb library and its methods directly, I've built a small library on top of ndb to abstract its functionality, so I can call insertUser() instead of working directly with ndb. The problems appear when I try to test this small library (I've written a test.py file for that).
At first I thought this couldn't work because the test was executing without the dev server running. Then I searched for information on how to simulate the Datastore locally and found the emulator, but I also found local unit testing with the stubs, and now I don't understand anything.
I've tried both (the gcloud Datastore emulator and the stub with unittest) and I can't get a simple example to work:
I want to test that an entity is saved in Datastore, and then test that I can read that entity back.
I assume the dev server (in the SDK) emulates the Datastore (because I can see the list of my entities there), so why use the local Datastore emulator? And why is the stub necessary if we have a Datastore emulator for all the testing I want to do? I don't understand.
I understand that my question may be more about concepts than code, but I really need to understand the right way to work with this.
I think I finally solved and understood my problem. If I were working with another system that I wanted to connect to Cloud Datastore, I would need the emulator. But that isn't my case, so I need to use the stubs with unittest, because there is no simple way (I think it's impossible) to do this against the dev server while it is running.
But I found two main problems:
The first was how to import the google_appengine libraries, because the documentation isn't very clear (in my view). After searching through other users' answers, I found that my solution was something like this:
import sys

sys.path.insert(1, '../../../../google_appengine')
if 'google' in sys.modules:
    del sys.modules['google']
from google.appengine.ext import ndb
from google.appengine.ext import testbed
The second was that when I ran my tests, a later unit test would fail: for example, the first test saved the data and the second test checked, with a read method, that the data had been saved correctly.
When I initialized datastore_v3_stub I used save_changes=True to specify that I wanted the changes to be permanent, but it didn't work, and I could see that the changes weren't actually saved.
Then I found the datastore_file parameter in the testbed docs. When I used it to specify a file where the database is temporarily saved, all the tests began to work fine:
self.testbed.init_datastore_v3_stub(enable=True, save_changes=True, datastore_file='./dbFile')
I also added a final hook (from the unittest library) to delete this file when the tests end, avoiding errors on the next run:
@classmethod
def tearDownClass(cls):
    """Delete the temporary database file after all tests have run."""
    os.remove('./dbFile')
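Putting the pieces together, a complete minimal test case might look like the following sketch (UserModel is a hypothetical model standing in for whatever the wrapper library saves; tests run in alphabetical order, so the read test sees the entity written by the save test):
import os
import unittest

from google.appengine.ext import ndb, testbed

class UserModel(ndb.Model):  # hypothetical model for illustration
    name = ndb.StringProperty()

class DatastoreTests(unittest.TestCase):
    def setUp(self):
        self.testbed = testbed.Testbed()
        self.testbed.activate()
        # Persist entities to a file so they survive across test methods.
        self.testbed.init_datastore_v3_stub(
            enable=True, save_changes=True, datastore_file='./dbFile')
        self.testbed.init_memcache_stub()  # ndb also uses memcache

    def tearDown(self):
        self.testbed.deactivate()

    @classmethod
    def tearDownClass(cls):
        os.remove('./dbFile')  # clean up the temporary database file

    def test_1_save_user(self):
        UserModel(name='alice').put()

    def test_2_read_user(self):
        self.assertEqual(1, UserModel.query().count())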
I think GAE and the whole Google Cloud Platform are a very good solution for developing apps quickly, but I also think the docs need to be revised and extended, especially for non-expert programmers (like me).
I hope this solution helps someone; if you think I've made a mistake, please comment.

Search for a key in django.core.cache

I am writing a simple livechat app using Django. I keep data about chat sessions in a static variable on my Chat class, and locally it works fine.
I have deployed a test version of the app on Heroku, but Heroku is a cloud platform, and class variables are not synchronized between different threads.
So I decided to use memcached, but I can't find out whether django.core.cache allows searching for a key in the cache, or iterating through the entire cache to check values. What is the best way to solve this problem?
Memcached only allows you to get and set entries by their keys; you can't iterate over the entries to check something. But if your cache keys are sequential (like sess1, sess2, etc.), you can check for their existence in a loop:
from django.core.cache import cache

for i in range(1000):
    sess = cache.get('sess%s' % i)
    # some logic
But anyway, it seems like a bad design decision. I don't have enough information about what you're doing, but I'd guess that some sort of persistent storage (like a database) would work nicely. You could also consider http://redis.io/, which has more features than memcached but is still very fast.
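For instance, Redis (here via the redis-py client, which is an assumption about your stack) can enumerate keys by pattern, which memcached cannot:
import redis  # pip install redis; assumes a Redis server on localhost

r = redis.Redis(host='localhost', port=6379)
r.set('sess1', 'some chat data')
# Unlike memcached, Redis can iterate over keys matching a pattern.
for key in r.scan_iter('sess*'):
    print(key, r.get(key))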

Python App Engine debug/dev mode

I'm working on an App Engine project (Python) where we'd like to make certain changes to the app's behavior when debugging/developing (most often locally). For example, when debugging, we'd like to disable our rate-limiting decorators, turn on the debug param in the WSGIApplication, maybe add some asserts.
As far as I can tell, App Engine doesn't naturally have any concept of a global dev-mode or debug-mode, so I'm wondering how best to implement such a mode. The options I've been able to come up with so far:
Use google.appengine.api.app_identity.get_default_version_hostname() to get the hostname and check if it begins with localhost. This seems... unreliable, and doesn't allow for using the debug mode in a deployed app instance.
Use os.environ.get('APPLICATION_ID') to get the application id, which according to this page is automatically prepended with dev~ by the development server. Worryingly, the very source of this information is in a box warning:
Do not get the App ID from the environment variable. The development server simulates the production App Engine service. One way in which it does this is to prepend a string (dev~) to the APPLICATION_ID environment variable, which is similar to the string prepended in production for applications using the High Replication Datastore. You can modify this behavior with the --default_partition flag, choosing a value of "" to match the master-slave option in production. Google recommends always getting the application ID using get_application_id, as described above.
Not sure if this is an acceptable use of the environment variable. Either way it's probably equally hacky, and suffers the same problem of only working with a locally running instance.
Use a custom app-id for development (locally and deployed), use the -A flag in dev_appserver.py, and use google.appengine.api.app_identity.get_application_id() in the code. I don't like this for a number of reasons (namely having to have two separate app engine projects).
Use a dev App Engine version for development and detect it with os.environ.get('CURRENT_VERSION_ID').split('.')[0] in code. When deployed this is easy, but I'm not sure how to make dev_appserver.py use a custom version without modifying app.yaml. I suppose I could sed app.yaml to a temp file in /tmp/ with the version replaced and the relative paths resolved (or just create a persistent dev-app.yaml), then pass that into dev_appserver.py. But that also seems kind of dirty and prone to error/sync issues.
Am I missing any other approaches? Any considerations I failed to acknowledge? Any other advice?
Regarding "detecting" local development, we use the following in our application's settings/config file:
import os

IS_DEV_APPSERVER = 'development' in os.environ.get('SERVER_SOFTWARE', '').lower()
That used in conjunction with the debug flag should do the trick for you.
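For example, the flag can drive the WSGIApplication debug parameter mentioned in the question. A sketch using webapp2 (the handler is hypothetical):
import os
import webapp2

IS_DEV_APPSERVER = 'development' in os.environ.get('SERVER_SOFTWARE', '').lower()

class MainHandler(webapp2.RequestHandler):  # hypothetical handler
    def get(self):
        self.response.write('debug mode: %s' % IS_DEV_APPSERVER)

app = webapp2.WSGIApplication(
    [('/', MainHandler)],
    debug=IS_DEV_APPSERVER)  # verbose tracebacks only on the dev server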
