Django ORM not performing SELECT queries in testing

I'm experiencing an issue with the Django ORM in our application after recently upgrading from Django 1.4 to Django 1.6. The issue only shows up while running tests, not in the development or production environments.
More specifically, we have a post_save hook which collects information from tables that subclass our main data object and coalesces the data stored in those objects into a SearchDocument object, which is then stored and queried to provide full-text searching in the interface. This works in both development and production, but the unit tests fail.
Our method, which is in the process of being debugged, looks something like this right now:
from django.db import connection

def _update_search_document(self, doc):
    """Updates and saves an existing search document for this model.
    This will sync the search text and key/value attributes."""
    # A bunch of code updating the model properties prior to saving.
    doc.save()
    print SearchDocument.objects.all()  # unexpectedly prints []
    print connection.queries            # the INSERT is logged, the SELECTs are not
This code saves correctly, but the subsequent call to SearchDocument.objects.all() returns [], and connection.queries contains the INSERT SQL from the doc.save() call but at no point contains the SELECT SQL from any of the three SearchDocument queries that should have been performed by then.
To the best of my knowledge, this code was working prior to our migration from Django 1.4 to Django 1.6. I know that some of the query logic changed between 1.4 and 1.6 and I'm wondering if there's a setting (perhaps surrounding caching?) that's causing problems with our tests now.
Edit: After a bit more hacking, I've been able to determine that moving this test class to SimpleTestCase instead of TestCase (which subclasses TransactionTestCase and wraps each test in a DB transaction) resolves the issue I was seeing with items not showing up after they've been saved. It appears that something in that transaction management is causing the problems with the saving.

Quoting my edit because this is how we eventually solved it:
After a bit more hacking, I've been able to determine that moving this test class to SimpleTestCase instead of TestCase (which subclasses TransactionTestCase and wraps each test in a DB transaction) resolves the issue I was seeing with items not showing up after they've been saved. It appears that something in that transaction management is causing the problems with the saving.
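For illustration, a minimal sketch of the change (the test and model names are hypothetical; note that newer Django versions additionally require SimpleTestCase to opt in to database access):

from django.test import SimpleTestCase

class SearchDocumentTests(SimpleTestCase):
    # Unlike TestCase, SimpleTestCase does not wrap each test in a
    # transaction, so rows written by the post_save hook stay visible.
    def test_search_document_is_saved(self):
        item = MyDataObject.objects.create(title="example")  # hypothetical model
        self.assertEqual(SearchDocument.objects.count(), 1)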

Related

Managing migration of large number of fixtures in Django for unit testing

I currently have one fixture for all the tests in my Django application. It is fine, however updating unit tests each time I add a new object to my fixture is tedious:
Object counts and queryset equality assertions have to be updated.
Many get() calls fail when duplicates appear, for various reasons.
Having a dataset filled with every possible use case for each unit test seems like bad practice.
So I would like to have a fixture for each component of the app under test: e.g. if I have a model class MyModel, it has a dedicated TestCase, all its functionality has unit tests, and I would like them to have a dedicated fixture, as sketched below. The main interest is that this automatically solves all three points mentioned above.
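Something like the following is what I have in mind (the fixture file names are made up):

from django.test import TestCase

class MyModelTests(TestCase):
    # Each TestCase loads only the data it needs instead of one global fixture;
    # "common.json" would hold shared prerequisites (e.g. the configured user).
    fixtures = ["common.json", "mymodel.json"]

    def test_object_count(self):
        self.assertEqual(MyModel.objects.count(), 3)  # count known from this fixture alone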
However, it has some drawbacks:
File management: I copy a directory into Django's data directory for my fixture, so I would need to manage multiple data directories.
Redundancy of some fixture elements: many elements rely on the existence of other elements, up to the user object, so each fixture would need a prefilled database with some common objects (e.g. a configured user).
Django migrations.
The first two points are not the real problem, but the task of migrating fixtures alongside my codebase looks like hell. It is already hard to manage for one fixture; this post explains how to manage code, migrations, database state, and a fixture, but that seems like too much if I have to do it for every fixture of every test.
Is there some clean way out there to migrate fixtures as you migrate the database?
PS: If it matters, I am using Django 3.2 with Django REST Framework.
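The only regeneration recipe I can picture is a sketch like the following, run after each schema change, and it already assumes the old fixture still loads against the migrated schema:

from django.core.management import call_command

call_command("migrate")                    # bring the DB to the new schema
call_command("loaddata", "mymodel.json")   # reload the old fixture (may need hand-editing)
call_command("dumpdata", "myapp.MyModel",  # rewrite the fixture from the migrated data
             output="myapp/fixtures/mymodel.json", indent=2)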

Caching a static Database table in Django

I have a Django web application that is currently live and receiving a lot of queries. I am looking for ways to optimize its performance and one area that can be improved is how it interacts with its database.
In its current state, each request to a particular view loads an entire database table into a pandas dataframe, against which queries are done. This table consists of over 55,000 rows of text data (co-ordinates mostly).
To avoid needless queries, I have been advised to cache the table in memory, populating the cache the first time it's loaded. This would remove some overhead on the DB side of things. I've never used this feature of Django before, so I am a bit lost.
The Django manual does not seem to have a concrete implementation of what I want to do. Would it be a good idea to just store the entire table in memory or would storing it in a file be a better idea?
I had a similar problem and django-cache-machine worked like a charm. It uses Django's caching features to cache the results of your queries. It is very easy to set up (assuming you have already configured Django's cache backend):
pip install django-cache-machine
Then in the model you want to cache:
from django.db import models
from caching.base import CachingManager, CachingMixin

class MyModel(CachingMixin, models.Model):
    objects = CachingManager()
And that's it, your queries will be cached.
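For completeness, cache-machine reuses Django's standard cache configuration, so a minimal settings.py sketch could look like this (the backend choice is just an example):

# settings.py
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
    },
}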

How do I drop custom types using SQLAlchemy + PostgreSQL?

I have a script which cleans the database, and this is widely used in our tests.
First, we tried to use SQLAlchemy's MetaData.drop_all(), but it didn't resolve some foreign keys on deletion, which caused errors. Then I found this script from @zzzeek, which does almost the same, but in a "smart" way. It handles all the issues with foreign keys, but now there are several issues regarding changed custom types (ENUMs). The question is: how can I drop them all using SQLAlchemy? By executing DROP TYPE by hand only?
The tables in the database are created with Alembic, and even though the script above deletes all tables successfully, some custom ENUMs remain and everything fails on any attempt to recreate them.
Recreating the whole database is not a preferred solution, because default DB user for application shouldn't normally have rights to create databases.
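To make "by hand" concrete, the fallback I have in mind is a sketch like this, enumerating user-defined enum types from pg_catalog (it assumes an existing engine and the default public schema):

from sqlalchemy import text

with engine.begin() as conn:
    # typtype = 'e' marks enum types in pg_type
    rows = conn.execute(text(
        "SELECT t.typname FROM pg_type t "
        "JOIN pg_namespace n ON n.oid = t.typnamespace "
        "WHERE t.typtype = 'e' AND n.nspname = 'public'"))
    for (name,) in rows:
        conn.execute(text('DROP TYPE IF EXISTS "{}" CASCADE'.format(name)))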
Are you sure your MetaData instance fully describes all the tables?
Try:
metadata.reflect(bind=engine)
metadata.drop_all(bind=engine)
This is an ancient question, but it's still possible to come across this problem if create_type=False is on any postgresql.ENUM definitions.
Per the SQLAlchemy docs on create_type,
When False, no check will be performed and no CREATE TYPE or DROP TYPE is emitted, unless ENUM.create() or ENUM.drop() are called directly.
That means when running tests, while there may be a setup & teardown with create_all() and drop_all(), neither will affect custom enum types.
The solution is simply to remove create_type=False, since True is the default. Then all custom types will be created at the beginning of testing and dropped at the end, resulting in a perfectly clean test database.
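A minimal sketch of the offending definition and the fix (the column and type names are hypothetical):

import sqlalchemy as sa
from sqlalchemy.dialects import postgresql

# With create_type=False, create_all()/drop_all() skip the enum's DDL entirely:
status = sa.Column(postgresql.ENUM("active", "archived", name="status_enum",
                                   create_type=False))

# With the default (create_type=True), CREATE TYPE / DROP TYPE are emitted
# together with the table DDL:
status = sa.Column(postgresql.ENUM("active", "archived", name="status_enum"))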

Saving SQLAlchemy models to file

I'm coming from EntityFramework. I'm used to creating my db, running EF, and having a bunch of class files generated for each of my db objects (usually tables) filled with properties (usually columns).
Following this basic use example, I've figured out how to use reflection to generate models in memory. But how does one save the models to disk as classes? Since Python code isn't compiled, I guess the entire ORM could just be regenerated every time I run my application, but this feels very strange coming from my EF background. What's the best practice here? (BTW, I'm using this in the context of Flask.)
Generally you reflect them every time you run your app, yes. Otherwise they'd break when you update your schema, which would sort of defeat the point of using reflection.
It is possible to generate declarative classes based on the current state of your schema; there's a sqlacodegen module on PyPI that does this, and the author of SQLA has a database migration project called Alembic that tackles the similar problem of comparing two schemata.
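For reference, a minimal sketch of the reflect-on-startup approach using SQLAlchemy's automap extension (the connection URL and table name are placeholders; autoload_with is the SQLAlchemy 1.4+ spelling):

from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base

engine = create_engine("postgresql://user:pass@localhost/mydb")  # placeholder URL
Base = automap_base()
Base.prepare(autoload_with=engine)  # reflect the live schema into mapped classes

User = Base.classes.users           # assumes a "users" table exists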

flask-sqlalchemy or sqlalchemy

I am new to both Flask and SQLAlchemy. I just started working on a Flask app, and I am using SQLAlchemy for now. I was wondering if there is any significant benefit to using Flask-SQLAlchemy vs. SQLAlchemy. I could not find enough motivation at http://packages.python.org/Flask-SQLAlchemy/index.html, or maybe I did not understand the value. I would appreciate your clarifications.
The main feature of Flask-SQLAlchemy is proper integration with the Flask application: it creates and configures the engine, connection, and session, and ties them to the Flask app.
This setup is quite complex, as we need to create the scoped session and handle it properly according to the Flask application's request/response life-cycle.
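Roughly, the manual equivalent of what the extension sets up is the pattern from the Flask docs, sketched here (the database URL is a placeholder):

from flask import Flask
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

app = Flask(__name__)
engine = create_engine("sqlite:///app.db")  # placeholder URL
db_session = scoped_session(sessionmaker(bind=engine))

@app.teardown_appcontext
def shutdown_session(exception=None):
    # Release the scoped session at the end of each request/app context
    db_session.remove()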
In an ideal world that would be the only feature of Flask-SQLAlchemy, but actually it adds a few more things. Check out the docs for more info, or see this blog post with an overview of them: Demystifying Flask-SQLAlchemy (update: the original article is not available at the moment; there is a snapshot on the Web Archive).
When I first worked with Flask and SQLAlchemy, I didn't like this overhead, so I went and extracted the session management code from the extension. This approach works, although I discovered that it is quite difficult to do this integration properly.
So the easier approach (which is used in another project I am working on) is to just drop Flask-SQLAlchemy in and not use any of the additional features it provides. You will have db.session and you can use it as if it were a pure SQLAlchemy setup.
Flask-SQLAlchemy gives you a number of nice extras you would otherwise end up implementing yourself using SQLAlchemy.
Positive sides of using Flask-SQLAlchemy (see the sketch after this list):
Flask-SQLAlchemy handles session configuration, setup, and teardown for you.
Gives you a declarative base model that makes querying and pagination easier.
Backend-specific settings: Flask-SQLAlchemy scans installed libraries for Unicode support and, if that fails, automatically falls back to SQLAlchemy's Unicode handling.
Has a method called apply_driver_hacks that automatically sets sane defaults for things such as the MySQL pool size.
Has nice built-in methods create_all() and drop_all() for creating and dropping all tables, useful for testing and on the Python command line if you did something stupid.
Gives you get_or_404() instead of get() and first_or_404() instead of first(). Code examples at http://flask-sqlalchemy.pocoo.org/2.1/queries/
Automatically sets table names: Flask-SQLAlchemy converts your ClassName to class_name; this can be overridden by setting the __tablename__ class attribute.
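A short sketch touching several of the points above (the model, route, and database URL are made up):

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///demo.db"  # placeholder
db = SQLAlchemy(app)

class UserProfile(db.Model):
    # The table name "user_profile" is derived automatically from the class name
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80), nullable=False)

@app.route("/users/<int:user_id>")
def show_user(user_id):
    # Aborts with 404 instead of returning None when the row is missing
    return UserProfile.query.get_or_404(user_id).name

with app.app_context():
    db.create_all()  # the built-in convenience mentioned above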
Negative sides of using Flask-SQLAlchemy:
Using Flask-SQLAlchemy will add difficulties for migrating from Flask to, say, Pyramid if you ever need to. This is mainly due to the custom declarative base model in Flask-SQLAlchemy.
Using Flask-SQLAlchemy, you risk depending on a package with a much smaller community than SQLAlchemy itself, one that could drop out of active development at any time.
Some nice extras Flask-SQLAlchemy has can confuse you if you do not know they are there.
To be honest, I don't see any benefits. IMHO, Flask-SQLAlchemy creates an additional layer you don't really need. In our case we have a fairly complex Flask application with multiple databases/connections (master-slave), using both ORM and Core, where among other things we need to control our sessions / DB transactions (e.g. dry-run vs. commit modes). Flask-SQLAlchemy adds some additional functionality, such as automatic destruction of the session, that assumes things for you which are very often not what you need.
The SQLAlchemy documentation clearly states that you should use Flask-SQLAlchemy (especially if you don't understand its benefits!):
[...] products such as Flask-SQLAlchemy [...] SQLAlchemy strongly recommends that these products be used as available.
You can find this quote and a detailed motivation in the second question of the Session FAQ.
As @schlamar suggests, Flask-SQLAlchemy is definitely a good thing. I'd just like to add some extra context to the point made there.
Don't feel like you are choosing one over the other. For example, say we want to grab all records from a table using a Flask-SQLAlchemy model. It is as simple as:
Model.query.all()
For a lot of the simple cases, Flask-SQLAlchemy is going to be totally fine. The extra point I would like to make is: if Flask-SQLAlchemy is not going to do what you want, there's no reason you can't use SQLAlchemy directly.
from sqlalchemy import func
from myapp.database import db

num_foo = (db.session.query(func.count(OtherModel.id))
           .filter(OtherModel.is_deleted == False)  # filter() takes expressions, not kwargs
           .as_scalar())
db.session.query(Model.id, num_foo.label('num_foo')).order_by('num_foo').all()
As you can see, we can easily jump from one to the other with no trouble, and in the second example we are in fact using the Flask-SQLAlchemy-defined models.
Here is an example of a benefit Flask-SQLAlchemy gives you over plain SQLAlchemy.
Suppose you're using flask_user.
flask_user automates creation and authentication of user objects, so it needs to access your database. The class UserManager does this by calling through to something called an "adapter" which abstracts the database calls. You provide an adapter in the UserManager constructor, and the adapter must implement these functions:
from flask_user.db_adapters import DBAdapter  # abstract adapter base class in flask_user v0.6

class MyAdapter(DBAdapter):
    def get_object(self, ObjectClass, id):
        """ Retrieve one object specified by the primary key 'id'. """
        pass

    def find_all_objects(self, ObjectClass, **kwargs):
        """ Retrieve all objects matching the case sensitive filters in 'kwargs'. """
        pass

    def find_first_object(self, ObjectClass, **kwargs):
        """ Retrieve the first object matching the case sensitive filters in 'kwargs'. """
        pass

    def ifind_first_object(self, ObjectClass, **kwargs):
        """ Retrieve the first object matching the case insensitive filters in 'kwargs'. """
        pass

    def add_object(self, ObjectClass, **kwargs):
        """ Add an object of class 'ObjectClass' with fields and values specified in '**kwargs'. """
        pass

    def update_object(self, object, **kwargs):
        """ Update object 'object' with the fields and values specified in '**kwargs'. """
        pass

    def delete_object(self, object):
        """ Delete object 'object'. """
        pass

    def commit(self):
        pass
If you're using flask-sqlalchemy, you can use the built-in SQLAlchemyAdapter. If you're using plain SQLAlchemy (not Flask-SQLAlchemy), you might make different assumptions about the way objects are saved to the database (like the names of the tables), so you'll have to write your own adapter class.
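For contrast, with flask-sqlalchemy the bundled adapter is a couple of lines (this is the flask_user v0.6-era API; db, User, and app are assumed to already exist):

from flask_user import UserManager, SQLAlchemyAdapter

db_adapter = SQLAlchemyAdapter(db, User)     # db: Flask-SQLAlchemy instance, User: user model
user_manager = UserManager(db_adapter, app)  # wires flask_user into the Flask app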
