Stubs for local Django unittests with Google App Engine

I'd like to run local Django unit tests for a Google App Engine project. GAE recently received some Python unit testing utilities that let you create stubs for services such as memcache, the datastore, and the task queue.
I'd like to be able to use Django's unit testing framework. My first thought is to subclass DjangoTestSuiteRunner so that it does the following for each test case:
# setUp
self.testbed = testbed.Testbed()
# Then activate the testbed, which prepares the service stubs for use.
self.testbed.activate()
# Next, declare which service stubs you want to use.
self.testbed.init_datastore_v3_stub()
self.testbed.init_memcache_stub()
# ... after tests:
#
# Teardown
self.testbed.deactivate()
I'd like to know if anyone else has tried to run Django's testing framework with the new unittests that can be run from the command line for GAE, and if so what pitfalls they've encountered. For example, are there any issues with calling Django's django.test.utils.setup_test_environment and teardown_test_environment? What other issues might come up?
Incidentally, I'm not using any Django-GAE helpers such as google-app-engine-django.
Thank you for reading.

Just wanted to mention: standard Django unit testing worked very well for me with django-nonrel and GAE Test Bed, including task queues, memcache, etc. I think it is the same Python unit testing code that you mentioned.

Related

How to mock test Flask endpoints that mainly interact with 3rd party APIs?

We have developed various services using Flask, and all of them act as an intermediary between the end user and various 3rd party APIs (but no databases). I have developed tests using pytest with a test_client fixture etc. and it works well. But each time a route/endpoint is called by one of the tests, it actually interacts with the 3rd party APIs.
What is the proper way to make mock tests in this situation? I am wondering if I should just develop an exact copy of the endpoints, but simulate the lines where requests to 3rd party APIs occur. That seems like a lot of work, but if I were to just return some expected JSON from a given endpoint then I don't think the code would be tested very well.

For scenarios where you need to call third-party API, you could consider using Pytest VCR.
It is a pytest plugin that records all HTTP requests made inside a test method or function when you apply a decorator, as follows.
import pytest

@pytest.mark.vcr()
def test_my_client():
    """
    Example of a function that tests a client doing HTTP requests
    """
    some_client = SomeClient()
    some_client.authenticate()
    resource = some_client.get_resource("foo")
    assert resource is not None
The first time you run your test, the calls are recorded into a file called a "cassette". Subsequent runs of the test use this cassette as a kind of mock instead of sending real HTTP calls.
It also works if you are testing Flask application views in which HTTP calls are made.
Be careful: if you change the tested code in a way that adds HTTP calls, the test will break because the new calls won't exist in the cassette. You will either have to delete the cassette and record it again, or run your test with --vcr-record=new_episodes to add the new calls to the existing cassette.
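To make the cassette idea concrete, here is a toy record-and-replay sketch using only the standard library. It is not how pytest-vcr is implemented; `replay_or_record`, `fake_fetch` and the `"GET /foo"` key are hypothetical names chosen for illustration.

```python
import json
import os
import tempfile


def replay_or_record(cassette_path, key, fetch):
    """Return the recorded response for `key`, calling `fetch` only if none exists."""
    cassette = {}
    if os.path.exists(cassette_path):
        with open(cassette_path) as f:
            cassette = json.load(f)
    if key not in cassette:
        # The "real" call happens only when the cassette has no entry yet.
        cassette[key] = fetch()
        with open(cassette_path, "w") as f:
            json.dump(cassette, f)
    return cassette[key]


calls = []


def fake_fetch():
    """Stand-in for a real HTTP call; counts how often it is invoked."""
    calls.append(1)
    return {"resource": "foo"}


cassette_path = os.path.join(tempfile.mkdtemp(), "cassette.json")
first = replay_or_record(cassette_path, "GET /foo", fake_fetch)   # records
second = replay_or_record(cassette_path, "GET /foo", fake_fetch)  # replays
```

The second call never reaches `fake_fetch`, which is also why a newly added request fails against an old cassette until it is re-recorded.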

Hot reloading properties in a Python Flask/Django app

Gurus, Wizards, Geeks
I am tasked with providing Python Flask apps (more generally, webapps written in python) a way to reload properties on the fly.
Specifically, my team and I currently deploy Python apps with a {env}.properties file that contains various environment-specific configurations in a key-value format (YAML, for instance). Ideally, these properties are reloaded by the app when they change. If a secondary application updates the {env}.properties file, the app should also be able to read and use the new values.
Currently, we read the {env}.properties at startup and the values are accessed via functions stored in a context.py. I could write a function that could periodically update the variables. Before starting an endeavor like this, I thought I would consult the collective to see if someone else has solved this for Django or Flask projects (as it seems like a reasonable request for feature flags, etc).

One such pattern is the WSGI application factory pattern.
In short, you define a function that instantiates the application object. This pattern works with all WSGI-based frameworks.
The Flask docs explain application factories pretty well.
This allows you to define the application dynamically on-the-fly, without the need to redeploy or deploy many configurations of an application. You can change just about anything about the app this way, including configuration, routes, middlewares, and more.
A simple example of this would be something like:
from flask import Flask

def get_settings(env):
    """get the (current, updated) application settings"""
    ...
    return settings

def create_app(env: str):
    if env not in ('dev', 'staging', 'production'):
        raise ValueError(f'{env} is not a valid environment')
    app = Flask(__name__)
    app.config.update(get_settings(env))
    return app
Then, you could set the FLASK_APP environment variable to something like "myapp:create_app('dev')" and that would do it. You can specify the factory the same way for servers like gunicorn.
The get_settings function should be written to return the newest settings. It could even do something like retrieve settings from an external source like S3, a config service, or anything.
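One way a get_settings function could return the newest settings is to re-read the properties file only when its modification time changes. The sketch below assumes a plain `key=value` file format and takes a file path rather than an environment name; `parse_properties` and `_cache` are hypothetical helpers, not part of Flask.

```python
import os
import tempfile

_cache = {}  # path -> (mtime_ns, settings dict)


def parse_properties(text):
    """Parse simple `key=value` lines, skipping blanks and `#` comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def get_settings(path):
    """Return the current settings, re-reading the file only when it changed."""
    mtime = os.stat(path).st_mtime_ns
    cached = _cache.get(path)
    if cached is None or cached[0] != mtime:
        with open(path) as f:
            _cache[path] = (mtime, parse_properties(f.read()))
    return _cache[path][1]


# Demo: write a file, read it, change it, read it again.
path = os.path.join(tempfile.mkdtemp(), "dev.properties")
with open(path, "w") as f:
    f.write("# demo settings\nrate_limit=10\n")
before = get_settings(path)["rate_limit"]

with open(path, "w") as f:
    f.write("rate_limit=20\n")
# Bump the mtime explicitly so the demo does not depend on clock resolution.
st = os.stat(path)
os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 10**9))
after = get_settings(path)["rate_limit"]
```

Calling a function like this whenever a value is needed, instead of reading the file once at startup, keeps the cost to one stat call per lookup while the file is unchanged.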

Concepts, use and testing Cloud Datastore in local

I'm really confused about how to try out Datastore locally. Please give me a minute to explain.
I'm developing an app composed of a few microservices, deployed as a single GAE app. One part of the app uses the datastore. When I run the app on the development server and save something to the datastore, I can see the entity in the GAE admin web console.
Now, instead of calling the ndb library and its methods directly, I've built a small library on top of ndb to abstract its functionality, so I can call insertUser() instead of working with ndb directly. The problems appear when I try to test this small library (I've written a test.py file for that).
At first I thought this could not work because the test was executing without the dev server running. Then I searched for how to simulate the datastore locally and found the Datastore emulator, but afterwards I also found local unit testing with stubs, and now I don't understand anything.
I've tried both (the gcloud Datastore emulator and stubs with unittest) and I can't get this simple example to work:
I want to test that an entity is saved in Datastore, and then test that I can read it back.
I suppose that the dev server (in the SDK) emulates the datastore, because I can see the list of my entities there. But then why use the Datastore emulator for local development? And why is the datastore stub necessary if we have a Datastore emulator that can run all the tests I want? I don't understand.
I understand that my question may be more about concepts than code, but I need to really understand the best way to work with this.

I think I have finally solved and understood my problem. If I were working with another system that needed to connect to Cloud Datastore, I would need to use the emulator. But that isn't my case. So I need to use the stubs with unittest, because there is no simple way (I think it's impossible) to do this against the dev server while it is running.
But I found two main problems:
The first was how to import the google_appengine libraries, because the documentation isn't very clear on this (in my view). After searching through other users' suggestions, my solution was something like this:
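import sys

# Put the SDK on the import path (the relative path depends on your layout).
sys.path.insert(1, '../../../../google_appengine')
# Drop any already-imported 'google' package so the SDK's version is found.
if 'google' in sys.modules:
    del sys.modules['google']
from google.appengine.ext import ndb
from google.appengine.ext import testbed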
The second was that after one test ran, the next one failed. For example, the first test saves the data and the second checks, with a read method, that the data was saved correctly.
When I initialized datastore_v3_stub I passed save_changes=True to specify that I wanted the changes to be permanent, but that alone didn't work and the changes did not seem to be saved.
Later I found the datastore_file parameter in the testbed docs. Once I used it to specify a file in which to temporarily save the database, all tests began to work fine.
self.testbed.init_datastore_v3_stub(enable=True, save_changes=True, datastore_file='./dbFile')
In addition, I added a class-level teardown (from the unittest library) that deletes this file once the tests end, to avoid errors on the next run.
@classmethod
def tearDownClass(cls):
    """
    Delete the temporary database file after all tests have run.
    """
    os.remove('./dbFile')
I think GAE and Google Cloud Platform as a whole are a very good solution for developing apps quickly, but I also think the docs need to be revised and extended, especially for non-expert programmers (like me).
I hope this solution helps someone. If you think I've made a mistake, please comment.
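One plausible reading of the failure between tests (an assumption; the answer does not confirm it) is that each test got a fresh stub in setUp, so data saved by one test was not there for the next. An alternative to persisting data through a file is to make each test self-contained, saving and reading within the same test. The sketch below uses a hypothetical in-memory `FakeStore` in place of the real datastore stub so that it runs without the SDK.

```python
import unittest


class FakeStore(object):
    """Hypothetical in-memory stand-in for a datastore stub."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)


class SelfContainedTest(unittest.TestCase):
    def setUp(self):
        # A fresh store per test, like activating a new testbed in setUp.
        self.store = FakeStore()

    def test_fresh_store_is_empty(self):
        # Nothing leaks in from other tests, whatever order they run in.
        self.assertIsNone(self.store.get("user:1"))

    def test_save_then_read(self):
        # Save and read inside one test instead of relying on a previous one.
        self.store.put("user:1", {"name": "Ada"})
        self.assertEqual(self.store.get("user:1"), {"name": "Ada"})


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SelfContainedTest)
result = unittest.TestResult()
suite.run(result)
```

Tests written this way do not depend on execution order, so no temporary database file has to survive between them.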

Why is App Engine Returning the Wrong Application ID?

The App Engine Dev Server documentation says the following:
The development server simulates the production App Engine service. One way in which it does this is to prepend a string (dev~) to the APPLICATION_ID environment variable. Google recommends always getting the application ID using get_application_id.
In my application, I use different resources locally than I do on production. As such, I have the following for when I startup the App Engine instance:
import logging
from google.appengine.api.app_identity import app_identity
# ...
# other imports
# ...
DEV_IDENTIFIER = 'dev~'
application_id = app_identity.get_application_id()
is_development = DEV_IDENTIFIER in application_id
logging.info("The application ID is '%s'", application_id)
if is_development:
    logging.warning("Using development configuration")
    # ...
    # set up application for development
    # ...
# ...
Nevertheless, when I start my local dev server via the command line with dev_appserver.py app.yaml, I get the following output in my console:
INFO: The application ID is 'development-application'
WARNING: Using development configuration
Evidently, the dev~ identifier that the documentation claims will be prepended to my application ID is absent. I have also tried to use the App Engine Launcher UI to see if that changed anything, but it did not.
Note that 'development-application' is the name of my actual application, but I expected it to be 'dev~development-application'.

Google recommends always getting the application ID using get_application_id
But, that's if you cared about the application ID -- you don't: you care about the partition. Check out the source -- it's published at https://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/api/app_identity/app_identity.py .
get_application_id uses os.getenv('APPLICATION_ID') then passes that to the internal function _ParseFullAppId -- which splits it by _PARTITION_SEPARATOR = '~' (thus removing again the dev~ prefix that dev_appserver.py prepended to the environment variable). The prefix is returned as the "partition" to get_application_id (which ignores it, only returning the application ID in the strict sense).
Unfortunately, there is no architected way to get just the partition (which is in fact all you care about).
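The effect of that split can be illustrated with a simplified stand-in. This is not the SDK's _ParseFullAppId (which also handles a domain part); `split_partition` is a hypothetical helper that only shows why the dev~ prefix never reaches the caller.

```python
def split_partition(full_app_id):
    """Split 'partition~app_id' into (partition, app_id); partition may be ''."""
    partition, separator, app_id = full_app_id.partition('~')
    if not separator:
        # No '~' present: the whole string is the application ID.
        return '', full_app_id
    return partition, app_id


# What the dev server puts in APPLICATION_ID, per the quoted documentation.
partition, app_id = split_partition('dev~development-application')
```

A caller that keeps only `app_id` sees 'development-application' whether or not a partition prefix was present.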
I would recommend that, to distinguish whether you're running locally or "in production" (i.e, on Google's servers at appspot.com), in order to access different resources in each case, you take inspiration from the way Google's own example does it -- specifically, check out the app.py example at https://cloud.google.com/appengine/docs/python/cloud-sql/#Python_Using_a_local_MySQL_instance_during_development .
In that example, the point is to access a Cloud SQL instance if you're running in production, but a local MySQL instance instead if you're running locally. But that's secondary -- let's focus instead on, how does Google's own example tell which is the case? The relevant code is...:
import os

if (os.getenv('SERVER_SOFTWARE') and
        os.getenv('SERVER_SOFTWARE').startswith('Google App Engine/')):
    pass  # ...snipped: what to do if you're in production!...
else:
    pass  # ...snipped: what to do if you're in the local server!...
So, this is the test I'd recommend you use.
Well, as a Python guru, I'm actually slightly embarrassed that my colleagues are using this slightly-inferior Python code (with two calls to os.getenv) -- me, I'd code it as follows...:
in_prod = os.getenv('SERVER_SOFTWARE', '').startswith('Google App Engine/')
if in_prod:
    pass  # ...whatever you want to do if we're in production...
else:
    pass  # ...whatever you want to do if we're in the local server...
but, this is exactly the same semantics, just expressed in more elegant Python (exploiting the second optional argument to os.getenv to supply a default value).
I'll be trying to get this small Python improvement into that example and to also place it in the doc page you were using (there's no reason anybody just needing to find out if their app is being run in prod or locally should ever have looked at the docs about Cloud SQL use -- so, this is a documentation goof on our part, and, I apologize). But, while I'm working to get our docs improved, I hope this SO answer is enough to let you proceed confidently.
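For unit testing, the same check can be wrapped in a small function that accepts the environment as a parameter. This is a sketch; `running_in_production` is a hypothetical name, and the sample SERVER_SOFTWARE values are only illustrative.

```python
import os


def running_in_production(environ=None):
    """True when SERVER_SOFTWARE identifies Google's production servers."""
    environ = os.environ if environ is None else environ
    return environ.get('SERVER_SOFTWARE', '').startswith('Google App Engine/')


# Illustrative values for production, the local dev server, and an unset variable.
prod = running_in_production({'SERVER_SOFTWARE': 'Google App Engine/1.9.25'})
local = running_in_production({'SERVER_SOFTWARE': 'Development/2.0'})
unset = running_in_production({})
```

Passing a plain dict keeps the tests independent of the process environment.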

That documentation seems wrong; when I run the commands locally it just prints the name from app.yaml.
That being said, we use
import os
os.getenv('SERVER_SOFTWARE', '').startswith('Dev')
to check if it is the dev appserver.

Python App Engine debug/dev mode

I'm working on an App Engine project (Python) where we'd like to make certain changes to the app's behavior when debugging/developing (most often locally). For example, when debugging, we'd like to disable our rate-limiting decorators, turn on the debug param in the WSGIApplication, maybe add some asserts.
As far as I can tell, App Engine doesn't naturally have any concept of a global dev-mode or debug-mode, so I'm wondering how best to implement such a mode. The options I've been able to come up with so far:
Use google.appengine.api.app_identity.get_default_version_hostname() to get the hostname and check if it begins with localhost. This seems... unreliable, and doesn't allow for using the debug mode in a deployed app instance.
Use os.environ.get('APPLICATION_ID') to get the application id, which according to this page is automatically prepended with dev~ by the development server. Worryingly, the very source of this information is in a box warning:
Do not get the App ID from the environment variable. The development
server simulates the production App Engine service. One way in which
it does this is to prepend a string (dev~) to the APPLICATION_ID
environment variable, which is similar to the string prepended in
production for applications using the High Replication Datastore. You
can modify this behavior with the --default_partition flag, choosing a
value of "" to match the master-slave option in production. Google
recommends always getting the application ID using get_application_id,
as described above.
Not sure if this is an acceptable use of the environment variable. Either way it's probably equally hacky, and suffers the same problem of only working with a locally running instance.
Use a custom app-id for development (locally and deployed), use the -A flag in dev_appserver.py, and use google.appengine.api.app_identity.get_application_id() in the code. I don't like this for a number of reasons (namely having to have two separate app engine projects).
Use a dev app engine version for development and detect with os.environ.get('CURRENT_VERSION_ID').split('.')[0] in code. When deployed this is easy, but I'm not sure how to make dev_appserver.py use a custom version without modifying app.yaml. I suppose I could sed app.yaml to a temp file in /tmp/ with the version replaced and the relative paths resolved (or just create a persistent dev-app.yaml), then pass that into dev_appserver.py. But that seems also kinda dirty and prone to error/sync issues.
Am I missing any other approaches? Any considerations I failed to acknowledge? Any other advice?

To detect local development, we use the following in our application's settings / config file.
IS_DEV_APPSERVER = 'development' in os.environ.get('SERVER_SOFTWARE', '').lower()
That used in conjunction with the debug flag should do the trick for you.
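As an illustration of how such a flag could switch off the rate-limiting decorators mentioned in the question, here is a minimal sketch. `unless_debug` and `rate_limited` are hypothetical names; the rate limiter is a toy that allows a single call, and the environment is passed in as a dict so the example runs anywhere.

```python
import functools


def is_dev_appserver(environ):
    """Same check as above, with the environment passed in for testability."""
    return 'development' in environ.get('SERVER_SOFTWARE', '').lower()


def unless_debug(decorator, debug):
    """Return `decorator`, or a pass-through decorator when `debug` is true."""
    if debug:
        return lambda func: func
    return decorator


def rate_limited(func):
    """Hypothetical rate limiter: allows a single call, then raises."""
    state = {'calls': 0}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        state['calls'] += 1
        if state['calls'] > 1:
            raise RuntimeError('rate limit exceeded')
        return func(*args, **kwargs)
    return wrapper


DEBUG = is_dev_appserver({'SERVER_SOFTWARE': 'Development/2.0'})


@unless_debug(rate_limited, DEBUG)
def handler():
    return 'ok'


# In debug mode the limiter is bypassed, so repeated calls succeed.
responses = [handler(), handler()]
```

Because the decision is taken from a single flag, that flag could equally be set by an explicit setting on a deployed instance rather than by SERVER_SOFTWARE.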
