Can someone show me how to write unit tests, using nose, for an SQLAlchemy model I created?
I just need one simple example.
Thanks.
You can simply create an in-memory SQLite database and bind your session to that.
Example:
from db import session  # probably a contextbound sessionmaker
from db import model
from sqlalchemy import create_engine
from nose.tools import eq_

def setup():
    engine = create_engine('sqlite:///:memory:')
    session.configure(bind=engine)
    # You probably need to create some tables and
    # load some test data, do so here.
    # To create tables, you typically do:
    model.metadata.create_all(engine)

def teardown():
    session.remove()

def test_something():
    instances = session.query(model.SomeObj).all()
    eq_(0, len(instances))
    session.add(model.SomeObj())
    session.flush()
    # ...
Check out the fixture project. We used nose to test it, and it also gives you a way to declaratively define the data to test against; there are some extensive examples there for you to use!
See also fixture documentation.
I am running tests on some functions. One of them uses database queries. The blogs and docs I have gone through say we have to make an in-memory or test database to test such functions. Below is my function:
def already_exists(story_data, c):
    # TODO(salmanhaseeb): Implement de-dupe functionality by checking if it
    # already exists in the DB.
    c.execute("""SELECT COUNT(*) from posts where post_id = ?""", (story_data.post_id,))
    (number_of_rows,) = c.fetchone()
    if number_of_rows > 0:
        return True
    return False
This function hits the production database. My question is: when testing, I create an in-memory database and populate my values there, so I will be querying that database (the test DB). But when I call already_exists() from a test, the production DB is hit. How do I make my test hit the test DB while testing this function?
There are two routes you can take to address this problem:
1. Make an integration test instead of a unit test, and just use a copy of the real database.
2. Provide a fake to the method instead of an actual connection object.
Which one you should do depends on what you're trying to achieve.
If you want to test that the query itself works, then you should use an integration test. Full stop. The only way to make sure the query works as intended is to run it, with test data already in place, against a copy of the database. Running it against a different database technology (e.g., running against SQLite when your production database is PostgreSQL) will not ensure that it works in production.
Needing a copy of the database means you will need some automated deployment process for it that can be easily invoked against a separate database. You should have such an automated process anyway, as it helps ensure that your deployments across environments are consistent, allows you to test them prior to release, and "documents" the process of upgrading the database. Standard solutions are migration tools written in your programming language, like Alembic, or tools that execute raw SQL, like yoyo or Flyway. You would need to invoke the deployment and fill the database with test data prior to running the test, then run the test and assert the output you expect to be returned.
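As a rough sketch of what such an integration test could look like (written with pytest; the Story class and the import path are assumptions, and sqlite3 stands in only to keep the example self-contained, since per the advice above you would really run this against a copy of your PostgreSQL database):

import sqlite3

import pytest

# Hypothetical import paths for the question's function and object.
from mymodule import already_exists, Story

@pytest.fixture
def test_db_cursor():
    # A real integration test would run the schema deployment against a
    # copy of the production database; sqlite3 is used here only so the
    # sketch runs on its own.
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE posts (post_id INTEGER)')
    conn.execute('INSERT INTO posts (post_id) VALUES (10)')
    yield conn.cursor()
    conn.close()

def test_already_exists_finds_existing_post(test_db_cursor):
    assert already_exists(Story(post_id=10), test_db_cursor)

def test_already_exists_misses_absent_post(test_db_cursor):
    assert not already_exists(Story(post_id=99), test_db_cursor)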
If you want to test the code around the query and not the query itself, then you can use a fake for the connection object. The most common solution is a mock. Mocks provide stand-ins that can be configured to accept function calls and inputs and to return some output in place of the real object. This would allow you to test that the logic of the method works correctly, assuming that the query returns the results you expect. For your method, such a test might look something like this:
from unittest.mock import Mock

...

def test_already_exists_returns_true_for_positive_count():
    mock_conn = Mock(
        execute=Mock(),
        fetchone=Mock(return_value=(5,)),
    )
    story = Story(post_id=10)  # Making some assumptions about what your object might look like.

    result = already_exists(story, mock_conn)

    assert result
    # Possibly assert calls on the mock. The value of these asserts is debatable.
    mock_conn.execute.assert_called_once_with(
        """SELECT COUNT(*) from posts where post_id = ?""", (story.post_id,)
    )
    mock_conn.fetchone.assert_called_once()
The remaining issue is ensuring that your code consistently uses the same database connection; then you can set it once to whatever is appropriate for the current environment.
Rather than passing the database connection around from method to method, it might make more sense to make it a singleton.
def already_exists(story_data):
    # Here `connection` is a singleton which returns the database connection.
    connection.execute("""SELECT COUNT(*) from posts where post_id = ?""", (story_data.post_id,))
    (number_of_rows,) = connection.fetchone()
    if number_of_rows > 0:
        return True
    return False
Or associate the connection with each object and turn already_exists into a method. It should probably be a method regardless.
def already_exists(self):
    # Here the connection is associated with the object.
    self.connection.execute("""SELECT COUNT(*) from posts where post_id = ?""", (self.post_id,))
    (number_of_rows,) = self.connection.fetchone()
    if number_of_rows > 0:
        return True
    return False
But really you shouldn't be rolling this code yourself. Instead, you should use an ORM such as SQLAlchemy, which takes care of basic queries and connection management like this for you. It gives you a single point of connection, the "session".
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy_declarative import Address, Base, Person

engine = create_engine('sqlite:///sqlalchemy_example.db')
Base.metadata.bind = engine

DBSession = sessionmaker(bind=engine)
session = DBSession()
Then you use that session to make queries. For example, a Query has an exists() method:
q = session.query(Post).filter(Post.post_id == story_data.post_id)
session.query(q.exists()).scalar()
Using an ORM will greatly simplify your code. Here's a short tutorial for the basics, and a longer and more complete tutorial.
I would like to create a custom method to tell me whether a row exists in my database. The SQLAlchemy exists method returns a subquery but doesn't execute anything, so I wanted to create the method does_exist, which will simply return True or False. Here is my code:
from flask_sqlalchemy import SQLAlchemy, BaseQuery, Model

class CustomBaseQuery(BaseQuery):
    def does_exist(self):
        return db.session.query(self.exists()).scalar()

db = SQLAlchemy(app, query_class=CustomBaseQuery)
This actually does work, but it seems wrong to refer to db.session within the body of the method, which thus depends on later naming the SQLAlchemy instance db. I would like to find a way to reference the eventual db.session object in a more general way.
Full working example here: https://gist.github.com/wbruntra/3db7b630e6ffb86fe792e4ed5a7a9578
Though undocumented, the session used by the Query object is accessible as
self.session
so your more generic CustomBaseQuery could look like
class CustomBaseQuery(BaseQuery):
    def does_exist(self):
        return self.session.query(self.exists()).scalar()
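Usage is then the same for any model backed by the custom query class; with a hypothetical Account model, for example:

# Hypothetical model, shown only to illustrate the call site.
class Account(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(100))

# Returns True or False without referring to db.session by name.
Account.query.filter_by(name='alice').does_exist()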
Suppose I'm following the Flask-SQLAlchemy quickstart example, and I want to add a couple of unittests.
My model might look something like:
from sqlalchemy import Column, Integer, String

db = SQLAlchemy(app)
Base = db.Model

class Account(Base):
    id = Column(Integer, primary_key=True)
    name = Column(String(1000))
For unittesting, I'll want to create and destroy a database for each test.
def _setup_database():
    db_name = 'test_%s' % random.randint(0, 999999)
    # (Code that creates the database with db_name and sets up the schema)
    app.config['DB_NAME'] = db_name
    app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql:///{}'.format(db_name)

def _destroy_db():
    db_name = app.config['DB_NAME']
    # (Code that destroys the test db)

class TestAccounts(unittest.TestCase):
    def setUp(self):
        _setup_database()

    def tearDown(self):
        _destroy_db()

    def test_new_signup(self):
        account = models.Account(...)

    def test_something_else(self):
        account = models.Account(...)
The problem here is that, if I run this code in an environment where unit tests are multi-threaded, there is a race condition: two databases are often set up simultaneously, and app.config ends up pointing at only one of them.
If I were using SQLAlchemy directly, I would create a new session for each test and use that session. But Flask-SQLAlchemy creates sessions for me, and as such, seems to depend on having one global app.config['SQLALCHEMY_DATABASE_URI'] to point to a database.
What's the right way to create test databases and point a test thread to them with Flask-SQLAlchemy?
With unittest, TestCase.setUp and TestCase.tearDown are run around each test_ function.
So running your tests in a multithreaded process will indeed create a race condition. You could run your tests in a single thread, which would fix the race, but you would still create and destroy the whole database for each test, which is unnecessary and slow.
A better solution is to use the setUpClass and tearDownClass methods, which only run once per test class.
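A minimal sketch, reusing the helpers from the question (note that both methods must be classmethods):

class TestAccounts(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once before any test in this class.
        _setup_database()

    @classmethod
    def tearDownClass(cls):
        # Runs once after all tests in this class.
        _destroy_db()

    def test_new_signup(self):
        account = models.Account(...)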
If you need to run these fixtures for all classes in a module, there are also setUpModule and tearDownModule.
If you need to set up fixtures for the full session, now you have a problem... IMHO that is the time to switch to pytest.
We have distilled a situation down to the following:
import pytest
from django.core.management import call_command
from foo import bar
@pytest.fixture(scope='session')
def django_db_setup(django_db_setup, django_db_blocker):
    LOGGER.info('ran call_command')
    with django_db_blocker.unblock():
        call_command('loaddata', 'XXX.json')

@pytest.mark.django_db(transaction=True)
def test_t1():
    assert len(bar.objects.all())

@pytest.mark.django_db(transaction=True)
def test_t2():
    assert len(bar.objects.all())
The test fixture XXX.json includes one bar. The first test (test_t1) succeeds. The second test (test_t2) fails. It appears that the transaction=True attribute does not result in the database being reinitialized with the data from the test fixture.
If TransactionTestCase from django.test is used instead, the initialization happens before every test case in the class and all tests succeed.
from django.test import TransactionTestCase
from foo import bar
class TestOne(TransactionTestCase):
    fixtures = ['XXX.json']

    def test_tc1(self):
        assert len(bar.objects.all())

    def test_tc2(self):
        assert len(bar.objects.all())
        objs = bar.objects.all()
        for obj in objs:
            obj.delete()

    def test_tc3(self):
        assert len(bar.objects.all())
I would appreciate any perspectives on why the pytest example doesn't result in a reinitialized database for the second test case.
The django_db_setup fixture is session scoped, and therefore only runs once, at the beginning of the test session. When using transaction=True, the database gets flushed after every test (including the first), so any data added in django_db_setup is removed.
TransactionTestCase obviously knows that it is using transactions, and because it is a Django thing it knows that it needs to re-add the fixtures for each test. But pytest in general is not aware of Django's needs, so it has no way to know that it needs to re-run your django_db_setup fixture; as far as it's concerned, it only needs to run that once, since it is session scoped.
You have the following options:
Use a lower-scoped fixture, probably function scope as suggested in the comments. But this will probably be opt-in, and it will run within the test's transaction, so the data will be removed after each test completes (see the function-scoped sketch after the code below).
Write a fixture that is smart / Django-aware and knows when it needs to re-populate the data by detecting when the test is using transactions. You need to ensure that the database connection being used is not in a transaction. I have done this on Django 1.11 and it works fine, although it may need fixing after an upgrade. It looks something like this:
from unittest.mock import patch
from django.core.management import call_command
from django.db import DEFAULT_DB_ALIAS, ConnectionHandler
import pytest

_need_data_load = True

@pytest.fixture(autouse=True)
def auto_loaddata(django_db_blocker, request):
    global _need_data_load
    if _need_data_load:
        # Use a separate DB connection to ensure we're not in a transaction.
        con_h = ConnectionHandler()
        try:
            def_con = con_h[DEFAULT_DB_ALIAS]
            # We still need to unblock the database because that's a test-level
            # constraint which simply monkey-patches the database access methods
            # in django to prevent access.
            #
            # Also note here we need to use the correct connection object
            # rather than any default, and so I'm assuming the command
            # imports `from django.db import connection` so I can swap it.
            with django_db_blocker.unblock(), patch(
                'path.to.your.command.modules.connection', def_con
            ):
                call_command('loaddata', 'XXX.json')
        finally:
            con_h.close_all()
        _need_data_load = False

    using_transactional_db = (
        'transactional_db' in request.fixturenames
        or 'live_server' in request.fixturenames
    )
    if using_transactional_db:
        # If we're using a transactional db then we will dump the whole thing
        # on teardown, so we need to flag that we should set it up again after.
        _need_data_load = True
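For comparison, the first option (a function-scoped, opt-in fixture) could be as simple as this sketch, assuming pytest-django's db fixture and the question's XXX.json:

import pytest
from django.core.management import call_command
from foo import bar

@pytest.fixture
def loaded_data(db):
    # Function scoped: runs inside each test's transaction, so the data is
    # cleaned up automatically after every test.
    call_command('loaddata', 'XXX.json')

@pytest.mark.django_db
def test_t1(loaded_data):
    assert len(bar.objects.all())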
Using a non-declarative SQLAlchemy setup, can I use an association_proxy without having to modify my model class?
Preliminary hack:
(in the module where the SQLAlchemy mapping etcetera takes place:)
from sqlalchemy.ext.associationproxy import association_proxy

import ModelClass

# Placeholder arguments; association_proxy('target_collection', 'attr')
# needs the relationship name and the proxied attribute.
ModelClass.associationProperty = association_proxy('some_relation', 'some_attr')
mapper()  # etc...
This works but I am not sure how reliable this is.
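For what it's worth, a self-contained sketch of the same idea with a classical mapping (all names below are illustrative, not from the question; mapper() is the pre-2.0 SQLAlchemy API matching the code above). association_proxy is an ordinary descriptor, so attaching it after mapper() has run behaves the same as defining it inside the class:

from sqlalchemy import Column, ForeignKey, Integer, MetaData, String, Table
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.orm import mapper, relationship

metadata = MetaData()
users = Table('users', metadata,
              Column('id', Integer, primary_key=True))
keywords = Table('keywords', metadata,
                 Column('id', Integer, primary_key=True),
                 Column('word', String(50)),
                 Column('user_id', Integer, ForeignKey('users.id')))

class User(object):
    pass

class Keyword(object):
    pass

mapper(Keyword, keywords)
mapper(User, users, properties={
    'keyword_objs': relationship(Keyword),
})

# Attached after the fact, outside the class definition.
User.keywords = association_proxy('keyword_objs', 'word')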