I am running tests with pytest against a test database, using the following DB settings.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'postgres',
        'USER': 'something',
        'PASSWORD': 'password',
    },
}
Using the @pytest.mark.django_db marker, my test functions access a database called 'test_postgres' created for the tests.
@pytest.mark.django_db
def test_example():
    from django.db import connection
    cur_ = connection.cursor()
    print(cur_.db.settings_dict)
outputs:
{'ENGINE': 'django.db.backends.postgresql_psycopg2', 'AUTOCOMMIT': True, 'ATOMIC_REQUESTS': False, 'NAME': 'test_postgres', 'TEST_MIRROR': None,...
but if I spawn a separate process inside test_example:
def function_to_run():
    from django.db import connection
    cur_ = connection.cursor()
    logger.error(cur_.db.settings_dict)
@pytest.mark.django_db
def test_example():
    p = multiprocessing.Process(target=function_to_run)
    p.start()
I can see that in the child process the cursor is using the database named 'postgres', which is the non-test database. Output:
{'ENGINE': 'django.db.backends.postgresql_psycopg2', 'AUTOCOMMIT': True, 'ATOMIC_REQUESTS': False, 'NAME': 'postgres', 'TEST_MIRROR': None,...
Is there a way to pass a database connection argument to the child process from the original test function, so that the child routine uses the same database ('test_postgres') as my test function?
I found a workaround to my problem.
First you prepare a separate Django settings file for testing (settings_pytest.py), with the following DATABASES setting:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'test_database',
        'TEST_NAME': 'test_database',
        'USER': 'something',
        'PASSWORD': 'password',
    },
}
Notice that we define TEST_NAME and that it is the same as NAME, so that whether or not we run through the test runner, we will be accessing the same database.
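(On Django 1.7 and newer, the standalone TEST_NAME key was replaced by a TEST dictionary; the equivalent setting would look like this:)

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'test_database',
        'TEST': {
            'NAME': 'test_database',  # same as NAME, for the same reason
        },
        'USER': 'something',
        'PASSWORD': 'password',
    },
}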
Now you need to create this database, and run 'syncdb' and 'migrate' on it first:
sql> CREATE DATABASE test_database;
manage.py syncdb --settings=settings_pytest
manage.py migrate --settings=settings_pytest
Finally you can run your tests with:
py.test --reuse-db
You need to specify --reuse-db; database re-creation will never work here, since the default database is the same as the test database. If there are changes to your schema, you will need to recreate the database manually with the commands above.
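Concretely, recreating it by hand is just the commands above with a drop first:

sql> DROP DATABASE test_database;
sql> CREATE DATABASE test_database;
manage.py syncdb --settings=settings_pytest
manage.py migrate --settings=settings_pytest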
For the test itself, if you are adding records to the database that need to be accessed by the spawned child process, remember to add transaction=True to the pytest decorator:
def function_to_run():
    assert Model.objects.count() == 1

@pytest.mark.django_db(transaction=True)
def test_example():
    obj_ = Model()
    obj_.save()
    p = multiprocessing.Process(target=function_to_run)
    p.start()
In your function_to_run() declaration you're doing from django.db import connection. Are you sure that will use the correct test DB settings? I suspect the decorator you're using modifies the connection import to use test_postgres rather than postgres, but because you're importing outside of the decorator's scope, it's not picking up the right one. What happens if you put it inside the decorator-wrapped function, like so:
@pytest.mark.django_db
def test_example():
    def function_to_run():
        from django.db import connection
        cur_ = connection.cursor()
        logger.error(cur_.db.settings_dict)

    p = multiprocessing.Process(target=function_to_run)
    p.start()
Edit:
I'm not familiar with pytest_django, so I'm shooting in the dark at this point. I imagine the marker allows you to decorate a class as well, so have you tried putting all the tests that want to use this shared function and the db into one TestCase class? Like so:
from django.test import TestCase

@pytest.mark.django_db
class ThreadDBTests(TestCase):

    # The function we want to share among tests
    def function_to_run(self):
        from django.db import connection
        cur_ = connection.cursor()
        logger.error(cur_.db.settings_dict)

    # One of our tests that needs the db
    def test_example1(self):
        p = multiprocessing.Process(target=self.function_to_run)
        p.start()

    # Another test that needs the DB
    def test_example2(self):
        p = multiprocessing.Process(target=self.function_to_run)
        p.start()
I have two Postgres database connections, and when using the non-default one, ORM calls fail in views but not raw queries. I am using Docker, the containers are connected, and I'm running Django's own runserver command. Using the ORM in a Django command or the Django shell works fine, but not in a view.
As a side note, both databases actually belong to Django projects, but the main project reads some data directly from the other project's database using its own unmanaged model.
Python: 3.9.7
Django: 3.2.10
Postgres: 13.3 (main project), 12.2 (side project)
# settings
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'USER': 'pguser',
        'PASSWORD': 'pguser',
        'NAME': 'mainproject',
        'HOST': 'project-db',  # docker container
        'PORT': '5432',
    },
    'external': {
        'ENGINE': 'django.db.backends.postgresql',
        'USER': 'pguser',
        'PASSWORD': 'pguser',
        'NAME': 'sideproject',
        'HOST': 'side-db',  # docker container, attached to same network
        'PORT': '5432',
    },
}
# My unmanaged model
class MyTestModel(models.Model):
    class Meta:
        # table does not exist in 'default', but it does exist in 'external'
        db_table = 'my_data_table'
        managed = False

# my_data_table is very simple: it has an integer id field as primary key
# (works out of the box with Django) and otherwise only normal fields like
# IntegerField or CharField. Even when this model is empty, it still
# returns the PK field normally.
# My custom command in the main project
# python manage.py mycommand
# ...
def handle(self, *args, **options):
    # This works fine and data from 'external' is populated into MyTestModel
    data = MyTestModel.objects.using('external').all()
    for x in data:
        print(x, vars(x))
# My simple view in the main project
class MyView(TemplateView):
    def get(self, request, *args, **kwargs):
        # Using .raw() works:
        data = MyTestModel.objects.using('external').raw('select * from my_data_table')
        for x in data:
            print(x, vars(x))

        # This does not work; it throws
        # ProgrammingError: relation "my_data_table" does not exist
        data = MyTestModel.objects.using('external').all()
        for x in data:
            print(x, vars(x))

        return super().get(request, *args, **kwargs)
So somehow runserver and views do not generate the query correctly when using the ORM. It cannot be a connection error, because the command works, and so does the view when it uses .raw().
Now the funny thing is that if I change db_table to something that exists in both databases, say django_content_type, the ORM's filter() and all() work in the view too. And they then actually return data from the correct database: if the main project has 50 content types and the side project ('external') has 100, the view returns the 100 content types from 'external'.
I have tried everything: rebuilding Docker, creating new tables directly in the database, force-reloading the Postgres user, and making sure the user has owner and all other permissions (which should be fine anyway, since the command side works). I even tried using another database.
I know I didn't post my full settings and local settings, which could have helped more in solving this case.
But I noticed that I had django-silk installed locally, which captures each request and tries to analyze the database queries it makes. It looks like it was either loaded too early or doesn't play well with external databases; disabling django-silk in the installed apps and removing its middleware made the problem go away.
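For reference, the fix was along these lines (a sketch; the silk entries shown are the standard ones from django-silk's documentation):

# settings.py
INSTALLED_APPS = [
    # ...
    # 'silk',  # disabled: it interfered with ORM queries against the 'external' DB
]

MIDDLEWARE = [
    # ...
    # 'silk.middleware.SilkyMiddleware',  # disabled for the same reason
]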
I have a problem with testing; it's my first time writing tests.
I just created a test folder inside my app users, and a test_urls.py file for testing the URLs.
When I type:
python manage.py test users
It says:
Creating test database for alias 'default'...
Got an error creating the test database: database "database_name" already exists
Type 'yes' if you would like to try deleting the test database 'database_name', or 'no' to cancel:
What does it mean? What happens if I type yes? Do I lose all the data in my database?
When testing, Django creates a test database to work on so that your development database is not polluted. The error message says that Django is trying to create a test database named "database_name" and that this database already exists. You should check the databases in the database software you are using and see what is in database_name; it was probably created by mistake.
If you type yes, the database database_name will be deleted and it is unlikely that you will be able to recover the data. So try to understand what is going on first.
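If you are on PostgreSQL, for example, you can list the existing databases from psql to see what is there:

sql> SELECT datname FROM pg_database;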
You should set the name of the test database in settings.py. There is a specific TEST dictionary in the DATABASES setting for this:
settings.py

...
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'USER': 'mydatabaseuser',
        'NAME': 'mydatabase',
        'TEST': {
            'NAME': 'mytestdatabase',
        },
    },
}
...
By default, the prefix test_ is added to the name of your development database. You should look at your settings.py to see what is going on.
From the docs:
The default test database names are created by prepending test_ to the value of each NAME in DATABASES. When using SQLite, the tests will use an in-memory database by default (i.e., the database will be created in memory, bypassing the filesystem entirely!). The TEST dictionary in DATABASES offers a number of settings to configure your test database. For example, if you want to use a different database name, specify NAME in the TEST dictionary for any given database in DATABASES.
FWIW, if you get such a warning when using the --keepdb argument, such as
python manage.py test --keepdb [appname]
then this would typically mean that multiple instances of the Client were instantiated, perhaps one per test. The solution is to create one client for the test class and refer to it in all corresponding methods like so:
from django.test import TestCase, Client

class MyTest(TestCase):
    def setUp(self):
        self.client = Client()

    def test_one(self):
        response = self.client.get('/path/one/')
        # assertions

    def test_two(self):
        response = self.client.post('/path/two/', {'some': 'data'})
        # assertions
You could also (unverified) create a static client using the setUpClass class method.
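A minimal sketch of that (unverified, as noted) idea; note that Django's TestCase already assigns a fresh self.client per test, so the class-level attribute is given a different name here to avoid being shadowed:

from django.test import Client, TestCase

class MyTest(TestCase):
    @classmethod
    def setUpClass(cls):
        super(MyTest, cls).setUpClass()
        # One shared client for every test method in this class.
        cls.shared_client = Client()

    def test_one(self):
        response = self.shared_client.get('/path/one/')
        # assertions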
I use a pre-created Postgres database for my tests. Here is the pytest setup:
pytest.ini:
[pytest]
norecursedirs = frontend static .svn _build tmp*
DJANGO_SETTINGS_MODULE = src.settings.testing
addopts = --reuse-db
testing.py:
from .base import *

DEBUG = True

DATABASES = {
    'default': {
        'ENGINE': 'django.contrib.gis.db.backends.postgis',
        'NAME': 'db',
        'USER': 'root',
        'PASSWORD': 'pass',
        'HOST': 'localhost',
        'PORT': '5432',
    }
}
test fixtures:
@pytest.fixture(scope='session')
def user():
    return User.objects.create(name='Test')
test cases:
import pytest

pytestmark = pytest.mark.django_db

def test_user(user):
    print(user.pk)             # returns pk of newly created user
    print(User.objects.all())  # returns queryset with one user

def test_user2(user):
    print(user.pk)             # returns the same value as in the previous test
    print(User.objects.all())  # returns empty queryset
I can't understand the behavior of pytest fixtures here. The model instance is created once per session and is the same object in several test cases, but the actual DB contents differ: pytest removes the user row after the first test case.
How can I prevent that behavior and keep my DB records for the whole test session?
It's not a problem with --reuse-db, since the user is removed from one test to the next within the same test run.
The problem is that you're giving the fixture a session scope, which means it will be executed only once per test run, and since Django flushes the database between tests, your User instance is no longer available for the second test. Simply remove the scope from the fixture decorator:
@pytest.fixture()
def user():
    return User.objects.create(username='Test')
Edit: from the pytest-django docs: "Once set up, the database is cached for use by all subsequent tests and transactions are rolled back to isolate tests from each other. This is the same way the standard Django TestCase uses the database."
I don't see why you'd want to use the exact same User instance between tests; even if you were only mutating that particular instance, it would mean the tests depend on each other. In order to keep tests isolated, you should provide the User exactly as each test expects it.
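That said, if you genuinely need a record that survives the whole session, pytest-django provides a django_db_blocker fixture that lets a session-scoped fixture write outside the per-test transaction. A minimal sketch (the stock auth User model is assumed here; adapt the import to your project):

import pytest
from django.contrib.auth.models import User

@pytest.fixture(scope='session')
def user(django_db_setup, django_db_blocker):
    with django_db_blocker.unblock():
        # Committed outside the per-test transaction, so the row is visible
        # to every test in the session (and persists with --reuse-db).
        return User.objects.create(username='Test')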
I have a series of integration-level tests that are being run as a management command in my Django project. These tests are verifying the integrity of a large amount of weather data ingested from external sources into my database. Because I have such a large amount of data, I really have to test against my production database for the tests to be meaningful. What I'm trying to figure out is how I can define a read-only database connection that is specific to that command or connection object. I should also add that these tests can't go through the ORM, so I need to execute raw SQL.
The structure of my test looks like this:
class Command(BaseCommand):
    help = 'Runs Integration Tests and Query Tests against Prod Database'

    def handle(self, *args, **options):
        suite = unittest.TestLoader().loadTestsFromTestCase(TestWeatherModel)
        ret = unittest.TextTestRunner().run(suite)
        if len(ret.failures) != 0:
            sys.exit(1)
        else:
            sys.exit(0)
class TestWeatherModel(unittest.TestCase):
    def testCollectWeatherDataHist(self):
        wm = WeatherManager()
        wm.CollectWeatherData()
        self.assertTrue(wm.weatherData is not None)
And the WeatherManager.CollectWeatherData() method would look like this:
def CollectWeatherData(self):
    cur = connection.cursor()
    cur.execute(<Raw SQL Query>)
    self.weatherData = cur.fetchall()
    cur.close()
I want to somehow idiot-proof this, so that someone else (or me) can't come along later and accidentally write a test that would modify the production database.
You can achieve this by hooking into Django's connection_created signal, and then making the transaction read-only.
The following works for PostgreSQL:
from django.apps import AppConfig
from django.db.backends.signals import connection_created

class MyappConfig(AppConfig):
    def ready(self):
        def connection_created_handler(connection, **kwargs):
            with connection.cursor() as cursor:
                cursor.execute('SET default_transaction_read_only = true;')

        connection_created.connect(connection_created_handler, weak=False)
This can be useful for some specific Django settings (e.g. to run development code with runserver against the production DB) where you do not want to create a real read-only DB user. Alternatively, the same setting can be applied per connection directly in DATABASES:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mydb',
        'USER': 'myusername',
        'PASSWORD': 'mypassword',
        'HOST': 'myhost',
        'OPTIONS': {
            'options': '-c default_transaction_read_only=on'
        }
    }
}
Source: https://nejc.saje.info/django-postgresql-readonly.html
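With either approach, you can verify that a connection really is read-only with a quick check (a sketch):

from django.db import connection

with connection.cursor() as cursor:
    cursor.execute('SHOW default_transaction_read_only;')
    print(cursor.fetchone())  # expect ('on',)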
Man, once again, I should read the docs more carefully before I post questions here. I can define a read-only connection to my production database in the settings file, and then, straight from the docs:
If you are using more than one database, you can use django.db.connections to obtain the connection (and cursor) for a specific database. django.db.connections is a dictionary-like object that allows you to retrieve a specific connection using its alias:
from django.db import connections

cursor = connections['my_db_alias'].cursor()
# Your code here...
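Putting it together, the extra settings entry might look like this (the alias, credentials, and host here are made up for illustration; ideally the USER is a real PostgreSQL role with only SELECT grants):

DATABASES = {
    # ... the usual 'default' entry ...
    'prod_readonly': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'production_db',
        'USER': 'readonly_user',
        'PASSWORD': 'secret',
        'HOST': 'prod-host',
        'OPTIONS': {
            # Also reject writes at the session level, as shown above.
            'options': '-c default_transaction_read_only=on',
        },
    },
}

cursor = connections['prod_readonly'].cursor()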
If you add a serializer for your model, you can specify in the serializer which fields it should treat as read-only:
class AccountSerializer(serializers.ModelSerializer):
    class Meta:
        model = Account
        fields = ('id', 'account_name', 'users', 'created')
        read_only_fields = ('account_name',)
from http://www.django-rest-framework.org/api-guide/serializers/#specifying-read-only-fields
I am using Django 1.9 and have written a custom test runner. I cannot figure out why my production database is being used instead of my test_ database. I've read quite a lot of the Django docs to no avail. The name of my production database is zippymeals, so my test database should be test_zippymeals. But my tests seem to be running against the zippymeals database, not test_zippymeals. Here is the code for my custom test runner:
class TestRunner(DiscoverRunner):
    def __init__(self, pattern=None, top_level=None, verbosity=1,
                 interactive=True, failfast=False, **kwargs):
        super(TestRunner, self).__init__(pattern, top_level, verbosity,
                                         interactive, failfast, **kwargs)

    def setup_databases(self, **kwargs):
        bash_cmd = "createdb -T zippymeals_template test_zippymeals"
        process = subprocess.Popen(bash_cmd.split(), stdout=subprocess.PIPE)
        process.communicate()[0]

    def teardown_databases(self, old_config, **kwargs):
        bash_cmd = "dropdb test_zippymeals"
        process = subprocess.Popen(bash_cmd.split(), stdout=subprocess.PIPE)
        process.communicate()[0]
and I set the following in my settings.py file:
TEST_RUNNER = 'tests.test_runner.TestRunner'
I'm using PostgreSQL, so my custom test runner uses the zippymeals_template database (an empty copy of my database structure) to create an empty test_zippymeals database. The test_zippymeals database gets created fine; it's just not being used when I run my tests. Also, I've tried the following, with no luck:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'zippymeals',
        'TEST': {
            'NAME': 'test_zippymeals',
        },
    },
}
Does anyone know how to inherit from DiscoverRunner and ensure that the test_ database is being used?
You need to call the superclass method setup_databases(), which is responsible for pointing the connection alias at the test database. E.g.:
def setup_databases(self, **kwargs):
    db = super(TestRunner, self).setup_databases(**kwargs)
    # ... your code goes here ...
    return db
Here is a more generic version of the setup_databases method for running tests on Postgres databases:
import subprocess

from django import db
from django.test.runner import DiscoverRunner

class CustomTestRunner(DiscoverRunner):
    def setup_databases(self, **kwargs):
        # your code goes here
        bash_cmd = "createdb -T zippymeals_template test_zippymeals"
        process = subprocess.Popen(bash_cmd.split(), stdout=subprocess.PIPE)
        process.communicate()[0]

        # force Django to connect to the correct db for tests
        connection = db.connections['default']
        db_conf = connection.settings_dict
        db_conf['NAME'] = db_conf['TEST']['NAME']
        connection.connect()
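For symmetry you would keep a matching teardown that drops the database afterwards, as in the question; a sketch:

    def teardown_databases(self, old_config, **kwargs):
        bash_cmd = "dropdb test_zippymeals"
        process = subprocess.Popen(bash_cmd.split(), stdout=subprocess.PIPE)
        process.communicate()[0]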