I have some code (a celery task) which makes a call via urllib to a Django view. The code for the task and the view are both part of the same Django project.
I'm testing the task, and need it to be able to contact the view and get data back from it during the test, so I'm using a LiveServerTestCase. In theory, I set up the database in the setUp method of my test case (adding a list of product instances) and then call the task; it does some stuff, then calls the Django view through urllib (hitting the live server brought up by the LiveServerTestCase) and gets back a JSON list of product instances.
In practice, though, it looks like the products I add in setUp aren't visible to the view when it's called. It looks like the test case code is using one database (test_<my_database_name>) and the view running on the dev server is accessing another (the urllib call successfully contacts the view but can't find the product I've asked for).
Any ideas why this may be the case?
Might be relevant: we're testing against a MySQL database instead of SQLite.
Heading off two questions (but interested in comments if you think we're doing this wrong):
I know it seems weird that the task accesses the view using urllib. We do this because the task usually calls one of a series of third-party APIs to get info about a product, and if it cannot access these, it falls back to our own Django database of products. The code that makes the urllib call is generic and agnostic of which case we're dealing with.
These are integration tests, so we'd prefer to actually make the urllib call rather than mock it out.
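Roughly, the test looks like this (the model, task, and field names below are simplified placeholders for our real code):

```python
from django.test import LiveServerTestCase

from myapp.models import Product            # hypothetical model
from myapp.tasks import fetch_product_info  # hypothetical celery task


class FetchProductInfoTest(LiveServerTestCase):
    def setUp(self):
        # Rows created here go into the test database (test_<my_database_name>).
        self.product = Product.objects.create(name="Widget", sku="W-1")

    def test_task_falls_back_to_local_view(self):
        # The task runs in a separate celery worker process and eventually hits
        # <live_server_url>/... via urllib; the view behind that URL has to be
        # able to see self.product for the assertion to pass.
        result = fetch_product_info.delay(self.product.sku, self.live_server_url)
        self.assertIn("Widget", str(result.get(timeout=30)))
```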
The celery workers are still reading from the dev database even though the test server brings up a separate test database, because the workers use whatever database the settings file tells them to.
One fix would be to make a separate settings_test.py file that specifies the test database name, and to bring up celery workers from the test's setup code using subprocess (e.g. subprocess.check_output or Popen), consuming from a special queue used only for testing. These celery workers would then read from the test database rather than the dev database; a rough sketch follows below.
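A sketch of what that could look like (settings_test, the "myproject" name, and the "testing" queue are placeholders; Popen is used instead of check_output so the worker can keep running in the background during the test):

```python
import os
import subprocess

from django.test import LiveServerTestCase


class TaskIntegrationTest(LiveServerTestCase):
    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        # settings_test.py points DATABASES at the test_<my_database_name> database.
        cls.worker = subprocess.Popen(
            ["celery", "-A", "myproject", "worker", "-Q", "testing", "-l", "info"],
            env={**os.environ, "DJANGO_SETTINGS_MODULE": "myproject.settings_test"},
        )

    @classmethod
    def tearDownClass(cls):
        # Shut the test worker down so it doesn't outlive the test run.
        cls.worker.terminate()
        cls.worker.wait()
        super().tearDownClass()
```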
Related
I am working on a project that uses Django and Django REST Framework. In one of the views there's a method F() that does the following:
Fetches data from the database (read operation)
Sends a create (POST) request to a 3rd party API. (although not local, this is a write operation and this is where a race condition might take place)
Returns JSON data
I'd like F() to be atomic; in other words, if the server receives multiple requests for this view at the same time, it should handle them one at a time and not allow multiple threads to execute this block of code simultaneously. How can this be achieved? I have read that Django provides transaction.atomic(), but that only guarantees atomicity of database transactions; what I need is atomicity for a whole block of code, regardless of whether it accesses the database or not.
The concept you are looking for is a "mutex" or a "lock". This article may guide you in the right direction: https://lincolnloop.com/blog/distributed-locking-django/
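As a minimal sketch of the idea, here is a lock built on Django's cache framework (cache.add only succeeds if the key doesn't exist yet, so it can serve as a simple shared mutex; the key name, timeout, and the helpers in the usage comment are hypothetical):

```python
import time
from contextlib import contextmanager

from django.core.cache import cache


@contextmanager
def view_lock(key="my-view-lock", timeout=30, wait=0.1):
    # Spin until we manage to create the lock key; other processes/threads
    # calling this will block here until the holder releases it.
    while not cache.add(key, "locked", timeout):
        time.sleep(wait)
    try:
        yield
    finally:
        cache.delete(key)


# Usage inside the DRF view method (read_from_db / call_third_party_api are
# placeholders for your own code):
#
# def post(self, request):
#     with view_lock():
#         data = read_from_db()
#         call_third_party_api(data)   # the critical section
#         return Response(data)
```

For a multi-server deployment, a proper distributed lock (e.g. a Redis-based lock, as described in the linked article) is more robust than this cache.add approach.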
What's the most elegant way to fetch data from an external API if I want to be faithful to the Single Responsibility Principle? Where/when exactly should it be made?
Assuming I've got a POST /foo endpoint which after being called should somehow trigger a call to the external API and fetch/save some data from it in my local DB.
Should I add the call in the view? Or the Model?
I usually put any external API calls into a dedicated services.py module (at the same level as the models.py you're planning to save the results into, or in a common app if none of the existing apps are logically related).
Inside that module you can define a class called something like MyExternalService and add all the needed methods for fetching, posting, removing, etc., just as you would with a DRF API view.
Also remember to handle exceptions properly (timeouts, connection errors, error response codes), for example by defining custom exception classes.
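A sketch of that layout, assuming requests as the HTTP client (the service name, base URL, and endpoint are made up):

```python
# services.py
import requests


class ExternalServiceError(Exception):
    """Base class for errors raised by MyExternalService."""


class ExternalServiceTimeout(ExternalServiceError):
    """The external API did not respond in time."""


class MyExternalService:
    BASE_URL = "https://api.example.com"   # hypothetical

    def __init__(self, timeout=5):
        self.timeout = timeout

    def fetch_foo(self, foo_id):
        # Translate transport-level errors into the service's own exceptions
        # so callers never have to know which HTTP client is used.
        try:
            response = requests.get(f"{self.BASE_URL}/foo/{foo_id}", timeout=self.timeout)
            response.raise_for_status()
        except requests.Timeout as exc:
            raise ExternalServiceTimeout(str(exc)) from exc
        except requests.RequestException as exc:
            raise ExternalServiceError(str(exc)) from exc
        return response.json()
```

The view (or serializer) then only calls MyExternalService().fetch_foo(...) and decides what to persist, which keeps the HTTP details out of the models and views.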
I want to have additional attributes of request to be logged in access log of aiohttp server.
For example, I have middleware that adds a user attribute to each request, and I want to store this value in the extra attribute of access log records. The documentation suggests subclassing aiohttp.helpers.AccessLogger, which indeed seems like a good start, but what do I do next? Where do I put the instance of my custom logger? I looked through the code and it looks like it cannot be set at application creation time, only when the application is run. But I'm running the application in different ways, so it's not that convenient to modify the startup in several places (locally I'm using aiohttp-devtools runserver, and gunicorn for deployment).
So what should be correct approach here?
(I'd also like to do the same for the error log, but that seems even more complex, so for now I'm just using another middleware that catches errors and creates the log records I need.)
Taking into account that a development logging config is usually very different from a production one, keeping two different approaches for gunicorn and aiohttp-devtools is totally fine.
For the dev server you probably want to log everything to the console, while staging and production write logs differently.
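For the custom logger itself, a rough sketch using aiohttp.abc.AbstractAccessLogger is below; the assumption (from the question) is that the middleware stores the user under the "user" key on the request. The access_log_class keyword works for the plain runner, while gunicorn and aiohttp-devtools are wired up through their own configuration, which is why keeping two approaches is reasonable:

```python
from aiohttp import web
from aiohttp.abc import AbstractAccessLogger


class UserAccessLogger(AbstractAccessLogger):
    def log(self, request, response, time):
        # Attach the user (set by the auth middleware, per the question) as an
        # extra attribute so logging formatters/handlers can pick it up.
        self.logger.info(
            "%s %s -> %s (%.3fs)",
            request.method,
            request.path,
            response.status,
            time,
            extra={"user": request.get("user")},
        )


app = web.Application()
# Plain runner: pass the logger class directly.
web.run_app(app, access_log_class=UserAccessLogger)
```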
I use django-rest-framework for the backend and AngularJS for the frontend. I started writing e2e tests using Protractor and ran into a problem: after each test, all changes to the database are persisted.
In Django, every test is enclosed in a database transaction that is rolled back at the end of the test. Is there a way to enclose every Protractor test in a transaction? I know that I can use the Django live server and python-selenium and write the tests in Python, but then I lose the advantages of Protractor.
Unfortunately, there is no universal solution for this problem.
One option is to connect to your database directly from Protractor/Node.js with a database client of your choice and make the necessary database changes before, after or during the tests. You can even use ORMs like sequelize.js as an abstraction layer for your database tables. But, since your backend is not Node.js, having two database abstraction layers in two different languages would probably overcomplicate things.
Or, generally a better way: you can use your Django REST API in the "set up" and "tear down" phases of your Protractor tests to restore/prepare the necessary database state by making requests to the REST API with an HTTP client. Please see more at:
Direct Server HTTP Calls in Protractor
I want to be able to instrument Python applications so that I know:
Page generation time.
Percentage of time spent in external requests (mysql, api calls).
Number of mysql queries, what the MySQL queries were.
I want this data from production (not offline profiling) - because the time spent in various places will be different under load.
In PHP I can do this with XHProf or instrumentation-for-php. In Ruby on Rails/.NET/Java, I can do this with New Relic.
Is there such a package recommended for Python or Django?
Yes, it's perfectly possible. For example, use some magic switch in the URL, like "?profile-me", which triggers profiling in a Django middleware.
There are a number of snippets on the Internet, like this one: http://djangosnippets.org/snippets/70/ or modules like this one: http://code.google.com/p/django-profiling/ - but I haven't used any of them so I cannot recommend anything.
Anyway, the approach they take is similar to what I do, i.e. use Python's hotshot profiler module in a middleware that wraps your view. For the MySQL part, you can just use connection.queries from Django.
The nice thing about hotshot is that its output can be browsed using KCachegrind, as shown here: http://www.rkblog.rk.edu.pl/w/p/django-profiling-hotshot-and-kcachegrind/
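A rough sketch of that middleware idea, using cProfile instead of hotshot (hotshot was removed from modern Python); the "profile-me" switch is the one suggested above, and connection.queries requires DEBUG=True to be populated:

```python
import cProfile
import io
import pstats

from django.db import connection


class ProfileMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Only profile when the magic switch is present in the query string.
        if "profile-me" not in request.GET:
            return self.get_response(request)

        profiler = cProfile.Profile()
        response = profiler.runcall(self.get_response, request)

        # Replace the response body with the profile plus the SQL issued
        # while handling this request.
        out = io.StringIO()
        pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(30)
        out.write("\n%d SQL queries:\n" % len(connection.queries))
        for query in connection.queries:
            out.write("%s  %s\n" % (query["time"], query["sql"]))

        response["Content-Type"] = "text/plain"
        response.content = out.getvalue()
        return response
```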
New Relic now has a package for Python, including Django support through mod_wsgi.
https://support.newrelic.com/help/kb/python
django-prometheus is a good choice for handling production workloads, especially in a container environment like Kubernetes. Out of the box, it has middleware for tracking request latencies and counts (by view method), as well as database and cache access times. It wouldn't be a good solution for tracking which queries are actually executing, but that's where a logging solution like ELK would come into play. If it helps, I've written a post which walks through how to add custom metrics to a Django application.
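For reference, the basic wiring looks roughly like this (names as documented by the package; worth double-checking against the version you install):

```python
# settings.py -- rough django-prometheus wiring
INSTALLED_APPS = [
    # ...
    "django_prometheus",
]

MIDDLEWARE = [
    "django_prometheus.middleware.PrometheusBeforeMiddleware",
    # ... your existing middleware ...
    "django_prometheus.middleware.PrometheusAfterMiddleware",
]

# urls.py -- exposes the /metrics endpoint for Prometheus to scrape
# urlpatterns += [path("", include("django_prometheus.urls"))]
```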