I am developing a CherryPy application and I want to write some automated tests for it, using nosetests. The application uses SQLAlchemy as its database backend, so I need the fixture package to provide fixed datasets. I also want to do web tests. Here is how I put it all together:
I have a helper function init_model(test = False) in the file where all models are created. It connects to the production database, or to the test database if test == True or cherrypy.request.app.test == True, and calls create_all.
Then I have created a base class for tests like this:
class BaseTest(DataTestCase):
    def __init__(self):
        init_model(True)
        application.test = True
        self.app = TestApp(application)
        self.fixture = SQLAlchemyFixture(env = models, engine = meta.engine, style = NamedDataStyle())
        self.datasets = (
            # all the datasets go here
        )
And now I do my tests by creating child classes of BaseTest and calling self.app.some_method()
This is my first time doing tests in python and all this seems very complicated. I want to know if I am using the mentioned packages as their authors intended and if it's not overcomplicated.
That looks mostly like normal testing glue for a system of any size. In other words, it's not overly-complicated.
In fact, I'd suggest slightly more complexity in one respect: I think you're going to find setting up a new database in each child test class to be really slow. It's more common to at least set up all your tables once per run instead of once per class. Then, you either have each test method create all the data it needs for its own sake, and/or you run each test case in a transaction and roll it all back in a finally: block.
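A minimal sketch of the "create the tables once, roll back after each test" idea. It uses the standard-library sqlite3 module as a stand-in for the real database, and the names `get_connection`, `users`, and `RollbackTestCase` are made up for illustration:

```python
import sqlite3
import unittest

_conn = None


def get_connection():
    """Create the schema on first use, then reuse the same connection."""
    global _conn
    if _conn is None:
        _conn = sqlite3.connect(":memory:")
        _conn.execute("CREATE TABLE users (name TEXT UNIQUE)")
        _conn.commit()
    return _conn


class RollbackTestCase(unittest.TestCase):
    def setUp(self):
        self.conn = get_connection()

    def tearDown(self):
        # Undo whatever the test wrote so the next test starts clean.
        self.conn.rollback()

    def _count(self):
        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

    def test_first_insert(self):
        self.conn.execute("INSERT INTO users VALUES ('alice')")
        self.assertEqual(self._count(), 1)

    def test_second_insert(self):
        # Would violate the UNIQUE constraint if the other test's row
        # had not been rolled back.
        self.conn.execute("INSERT INTO users VALUES ('alice')")
        self.assertEqual(self._count(), 1)
```

With SQLAlchemy the same shape applies, with the rollback done on the session or connection the tests share.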
Related
I am running tests on some functions. I have a function that uses database queries. So, I have gone through the blogs and docs that say we have to make an in memory or test database to use such functions. Below is my function,
def already_exists(story_data, c):
    # TODO(salmanhaseeb): Implement de-dupe functionality by checking if it already
    # exists in the DB.
    c.execute("""SELECT COUNT(*) from posts where post_id = ?""", (story_data.post_id,))
    (number_of_rows,) = c.fetchone()
    if number_of_rows > 0:
        return True
    return False
This function hits the production database. When testing, I create an in-memory database and populate it with my values, and I want my queries to go to that test database. But when I call already_exists() from a test, the production database is hit. How do I make the function hit the test database while testing?
There are two routes you can take to address this problem:
Make an integration test instead of a unit test and just use a copy of the real database.
Provide a fake to the method instead of actual connection object.
Which one you should do depends on what you're trying to achieve.
If you want to test that the query itself works, then you should use an integration test. Full stop. The only way to make sure the query works as intended is to run it with test data already in a copy of the database. Running it against a different database technology (e.g., running against SQLite when your production database is PostgreSQL) will not ensure that it works in production. Needing a copy of the database means you will need some automated deployment process for it that can be easily invoked against a separate database. You should have such an automated process anyway, as it helps ensure that your deployments across environments are consistent, allows you to test them prior to release, and "documents" the process of upgrading the database. Standard solutions to this are migration tools written in your programming language, like Alembic, or tools that execute raw SQL, like yoyo or Flyway. You would need to invoke the deployment and fill the database with test data prior to running the test, then run the test and assert the output you expect to be returned.
If you want to test the code around the query and not the query itself, then you can use a fake for the connection object. The most common solution to this is a mock. Mocks provide stand ins that can be configured to accept the function calls and inputs and return some output in place of the real object. This would allow you to test that the logic of the method works correctly, assuming that the query returns the results you expect. For your method, such a test might look something like this:
from unittest.mock import Mock
...
def test_already_exists_returns_true_for_positive_count():
    mockConn = Mock(
        execute=Mock(),
        fetchone=Mock(return_value=(5,)),
    )
    story = Story(post_id=10) # Making some assumptions about what your object might look like.
    result = already_exists(story, mockConn)
    assert result
    # Possibly assert calls on the mock. Value of these asserts is debatable.
    # assert_called() takes no arguments; assert_called_with() checks them.
    mockConn.execute.assert_called_with("""SELECT COUNT(*) from posts where post_id = ?""", (story.post_id,))
    mockConn.fetchone.assert_called()
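For the integration-test route, note that the function already receives its cursor as an argument, so a test can hand it a cursor for a test database instead of the production one. A minimal sketch, assuming the production database is SQLite (the `?` placeholder suggests so, but that is a guess); `make_test_cursor` and the `SimpleNamespace` stand-in for the story object are hypothetical:

```python
import sqlite3
from types import SimpleNamespace


def already_exists(story_data, c):
    c.execute("SELECT COUNT(*) FROM posts WHERE post_id = ?", (story_data.post_id,))
    (number_of_rows,) = c.fetchone()
    return number_of_rows > 0


def make_test_cursor():
    """Build a throwaway in-memory database holding one known post."""
    conn = sqlite3.connect(":memory:")
    c = conn.cursor()
    c.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY)")
    c.execute("INSERT INTO posts (post_id) VALUES (10)")
    return c


def test_already_exists_against_test_database():
    c = make_test_cursor()
    # The cursor passed in decides which database is queried.
    assert already_exists(SimpleNamespace(post_id=10), c) is True
    assert already_exists(SimpleNamespace(post_id=11), c) is False
```

If production runs on a different database engine, the same test shape applies, but the cursor should come from a test copy of that engine.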
The issue is ensuring that your code consistently uses the same database connection. Then you can set it once to whatever is appropriate for the current environment.
Rather than passing the database connection around from method to method, it might make more sense to make it a singleton.
def already_exists(story_data):
    # Here `connection` is a singleton which returns the database connection.
    connection.execute("""SELECT COUNT(*) from posts where post_id = ?""", (story_data.post_id,))
    (number_of_rows,) = connection.fetchone()
    if number_of_rows > 0:
        return True
    return False
Or make connection a method on each class and turn already_exists into a method. It should probably be a method regardless.
def already_exists(self):
    # Here the connection is associated with the object.
    self.connection.execute("""SELECT COUNT(*) from posts where post_id = ?""", (self.post_id,))
    (number_of_rows,) = self.connection.fetchone()
    if number_of_rows > 0:
        return True
    return False
But really you shouldn't be rolling this code yourself. Instead you should use an ORM such as SQLAlchemy which takes care of basic queries and connection management like this for you. It has a single connection, the "session".
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy_declarative import Address, Base, Person
engine = create_engine('sqlite:///sqlalchemy_example.db')
Base.metadata.bind = engine
DBSession = sessionmaker(bind=engine)
session = DBSession()
Then you use that to make queries. For example, it has an exists method.
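# Assumes a mapped Post class with a post_id column.
q = session.query(Post).filter(Post.post_id == story_data.post_id)
session.query(q.exists()).scalar()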
Using an ORM will greatly simplify your code. Here's a short tutorial for the basics, and a longer and more complete tutorial.
Suppose I'm following the Flask-SQLAlchemy quickstart example, and I want to add a couple of unittests.
My model might look something like:
db = SQLAlchemy(app)
Base = db.Model
class Account(Base):
    id = Column(Integer, primary_key=True)
    name = Column(String(1000))
For unittesting, I'll want to create and destroy a database for each test.
def _setup_database():
    db_name = 'test_%s' % random.randint(0,999999)
    # (Code that creates database with db_name and sets up the schema)
    app.config['DB_NAME'] = db_name
    app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql:///{}'.format(db_name)

def _destroy_db():
    db_name = app.config['DB_NAME']
    # Code that destroys the test db

class TestAccounts(unittest.TestCase):
    def setUp(self):
        _setup_database()

    def tearDown(self):
        _destroy_db()

    def test_new_signup(self):
        account = models.Account(...)

    def test_something_else(self):
        account = models.Account(...)
The problem here is that, if I run this code in an environment where unit tests are multi-threaded, there is a race condition. Two databases are usually set up simultaneously, and app.config points to only one of them.
If I were using SQLAlchemy directly, I would create a new session for each test and use that session. But Flask-SQLAlchemy creates sessions for me, and as such, seems to depend on having one global app.config['SQLALCHEMY_DATABASE_URI'] to point to a database.
What's the right way to create test databases and point a test thread to them with Flask-SQLAlchemy?
With unittest, TestCase.setUp and TestCase.tearDown are run for each test_ function.
So running your tests in multiple threads will indeed create a race condition. You could run your tests in a single thread, which fixes the race condition, but you would still create and destroy the whole database for each test, which is unnecessary and slow.
A better solution is to use the setUpClass and tearDownClass methods, which only run once per test class.
If you need to run these fixtures for all classes in a module, there are also setUpModule and tearDownModule.
If you need to set fixtures for the full session, now you have a problem... IMHO that is the time to switch to pytest.
I'm currently working on a system which processes a user specified list (in a file) of metrics; a metric is a collection of some strings and some logic as to how to perform a certain calculation. A python script is periodically run on a linux server and consults the file.
Metrics should be easily extensible, so I'm looking for some modular way to encapsulate the code for the calculation into the specification of a metric. Short of creating a config parsing system, which would severely limit the complexity of the calculations, I'm looking for a plugin system where the calculation of each metric can be specified in Python code.
For example, here are the core units of a metric.
db = "database0"
mes = "measurement0"
tags = ["alpha", "beta", "delta"]
calculation: (accepts parameters a and b)
    for item in a:
        b.add(item)
    return b
The python script needs to grab each metric from file, access their string fields and execute their calculation code, passing the same parameters to each.
I currently have a terrible method; in a separate python file, I define a base class which is to be extended by each new metric. This is what it looks like
class Metric(object):
    def __init__(self, database_name, measurement_name, classad_tags, classad_fields=[]):
        """
        A specification of a metric.
        Arguments:
        database_name - name of the influx DB (created if doesn't exist)
        measurement_name - measurement name with which to label metric in DB
        classad_tags - list of classad fields (or mock ads) which will
                       segregate values at a time for this metric, becoming
                       tags in the influxDB measurement
        classad_fields - any additional job classad fields that this metric will
                         look at (e.g. for metric value calculation).
                         These must be declared so that the daemon can fetch any
                         needed classads from condor
        """
        self.db = database_name
        self.mes = measurement_name
        self.tags = classad_tags
        self.fields = classad_fields

    def calculate_at_bin(self, time_bin, jobs):
        """
        """
        raise ReferenceError("")

"""
--------------------------------------------------------------------------------------
"""

class metric0(Metric):
    def __init__(self):
        db = 'testdb'
        mes = 'testmes0'
        tags = ['Owner']
        fields = []
        super(metric0, self).__init__(db, mes, tags, fields)

    def calculate_at_bin(self, time_bin, jobs):
        for job in jobs:
            if job.is_idle_during(time_bin.start_time, time_bin.end_time):
                time_bin.add_to_sum(1, job.get_values(self.tags))
        return time_bin.get_sum()
To create a new metric, you create a new subclass of Metric, passing the string fields to the super constructor and overriding the calculate_at_bin method. In the main python script, I create an instance of every metric with something like...
metrics = [metric() for metric in vars()['Metric'].__subclasses__()]
However, this is obviously terrible.
It requires subclass names that are completely arbitrary yet must be unique (metric0), and code that isn't really relevant to the metric (e.g. class definitions, calling the constructor, etc).
I could imagine also another terrible method, like a config file with some code which gets eval'd, but I'd want to avoid that too.
Note that there are no concerns about security or code injection; this system will really only be used by a handful of people that I can instruct in person.
So; how should I go about this?
Have you considered making the code of a metric an importable module, offering a calculate_at_bin function as the "interface", place all such modules in a directory and then import them dynamically? Would remove the issues with the possible name clashes and if security is no concern it would be at least very simple to handle...
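A minimal sketch of that idea, assuming each metric lives in its own .py file that defines `db`, `mes`, `tags`, and a `calculate_at_bin(time_bin, jobs)` function. The `load_metrics` helper, the `idle_jobs.py` file name, and the simplified calculation are made up for illustration:

```python
import importlib.util
import tempfile
from pathlib import Path


def load_metrics(directory):
    """Import every *.py file in `directory`; return modules keyed by file stem."""
    metrics = {}
    for path in sorted(Path(directory).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Skip files that do not offer the expected interface.
        if callable(getattr(module, "calculate_at_bin", None)):
            metrics[path.stem] = module
    return metrics


# Contents of one hypothetical metric file.
METRIC_SOURCE = '''
db = "testdb"
mes = "testmes0"
tags = ["Owner"]

def calculate_at_bin(time_bin, jobs):
    return sum(1 for job in jobs if job == "idle")
'''


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "idle_jobs.py").write_text(METRIC_SOURCE)
        metric = load_metrics(tmp)["idle_jobs"]
        value = metric.calculate_at_bin(None, ["idle", "running", "idle"])
        return metric.db, metric.mes, metric.tags, value
```

The file name serves as the unique identifier, so no class names or constructor calls are needed in a metric file.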
I am trying to do a simple test of a model. I insert and retrieve the model and check that all the data I inserted is present. I expect this test to fail with a simple, blank model, but it passes. Is this a quirk of the testing framework that I have to live with? Can I set an option to prevent it from keeping refs to Python objects?
In the following, I expect it to fail at line 30, but it does not. It fails at the ref comparison, as I insist the refs be different and they are not.
import unittest
from google.appengine.ext import ndb
from google.appengine.ext import testbed

class Action(ndb.Model): pass

class ActionTestCase(unittest.TestCase):
    def setUp(self):
        # First, create an instance of the Testbed class.
        self.testbed = testbed.Testbed()
        # Then activate the testbed, which prepares the service stubs for use.
        self.testbed.activate()
        self.testbed.init_datastore_v3_stub()
        self.testbed.init_memcache_stub()

    def tearDown(self):
        self.testbed.deactivate()

    def testFetchRedirectAttribute(self):
        act = Action()
        act.attr = 'test phrase'
        act.put()
        self.assertEquals(1, len(Action.query().fetch(2)))
        fetched = Action.query().fetch(2)[0]
        self.assertEquals(fetched.attr, act.attr)
        self.assertTrue(act != fetched)

if __name__ == '__main__':
    unittest.main()
Models are defined as being equal if all of their properties are equal. If you care about identity instead (you probably shouldn't...), then you can use assertIs in your test.
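To illustrate the difference between equality and identity in a test, here is a small sketch that does not use ndb at all. `FakeModel` is a hypothetical stand-in whose equality compares property values:

```python
import unittest


class FakeModel:
    """Stand-in for a model: equal when all properties are equal."""

    def __init__(self, **props):
        self.props = props

    def __eq__(self, other):
        return isinstance(other, FakeModel) and self.props == other.props


class EqualityVsIdentity(unittest.TestCase):
    def test_equal_but_distinct_objects(self):
        a = FakeModel(attr="test phrase")
        b = FakeModel(attr="test phrase")
        self.assertEqual(a, b)    # same property values
        self.assertIsNot(a, b)    # but two separate objects

    def test_same_object(self):
        a = FakeModel(attr="test phrase")
        same = a
        self.assertIs(a, same)    # identity: one and the same object
```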
As it turns out, storing refs is the behavior of the stubs. However, for TDD purposes, we do need to check whether a property is defined in the model. The simple way to do so is to use a keyword argument. If I write the test as follows, then it fails as expected.
def testFetchRedirectAttribute(self):
    act = Action(attr='test phrase')
    act.put()
This solved my immediate problem of having a failing test that I could code against.
I have the following django test case that is giving me errors:
class MyTesting(unittest.TestCase):
    def setUp(self):
        self.u1 = User.objects.create(username='user1')
        self.up1 = UserProfile.objects.create(user=self.u1)

    def testA(self):
        ...

    def testB(self):
        ...
When I run my tests, testA passes successfully, but before testB starts, I get the following error:
IntegrityError: column username is not unique
It's clear that it is trying to create self.u1 before each test case and finding that it already exists in the Database. How do I get it to properly clean up after each test case so that subsequent cases run correctly?
setUp and tearDown methods on Unittests are called before and after each test case. Define tearDown method which deletes the created user.
class MyTesting(unittest.TestCase):
    def setUp(self):
        self.u1 = User.objects.create(username='user1')
        self.up1 = UserProfile.objects.create(user=self.u1)

    def testA(self):
        ...

    def tearDown(self):
        self.up1.delete()
        self.u1.delete()
I would also advise creating user profiles with a post_save signal, unless you really want to create the user profile manually for each user.
Follow-up on delete comment:
From Django docs:
When Django deletes an object, it emulates the behavior of the SQL constraint ON DELETE CASCADE -- in other words, any objects which had foreign keys pointing at the object to be deleted will be deleted along with it.
In your case, the user profile points to the user, so deleting the user deletes the profile at the same time.
If you want django to automatically flush the test database after each test is run then you should extend django.test.TestCase, NOT django.utils.unittest.TestCase (as you are doing currently).
It's good practice to flush the database after each test so you can be extra sure your tests are consistent, but note that your tests will run slower with this additional overhead.
See the WARNING section in the "Writing Tests" Django Docs.
Precisely, setUp exists for the very purpose of running once before each test case.
The converse method, the one that runs once after each test case, is named tearDown: that's where you delete self.u1 etc (presumably by just calling self.u1.delete(), unless you have supplementary specialized clean-up requirements in addition to just deleting the object).
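The setUp/tearDown pairing can be sketched without Django. Here the standard-library sqlite3 module stands in for the test database, and a UNIQUE column reproduces the "username is not unique" failure that would occur if tearDown were missing; the `users` table and the `_count` helper are made up for illustration:

```python
import sqlite3
import unittest

# Shared database that outlives each individual test, like a test database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT UNIQUE)")
conn.commit()


class MyTesting(unittest.TestCase):
    def setUp(self):
        # Runs before every test method.
        conn.execute("INSERT INTO users (username) VALUES ('user1')")
        conn.commit()

    def tearDown(self):
        # Runs after every test method; without this delete, the second
        # setUp would raise sqlite3.IntegrityError on the UNIQUE column.
        conn.execute("DELETE FROM users WHERE username = 'user1'")
        conn.commit()

    def _count(self):
        return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

    def testA(self):
        self.assertEqual(self._count(), 1)

    def testB(self):
        self.assertEqual(self._count(), 1)
```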