DB Permissions with Django unit testing

Disclaimer:
I'm very new to Django. I must say that so far I really like it. :)
(now for the "but"...)
But, there seems to be something I'm missing related to unit testing. I'm working on a new project with an Oracle backend. When you run the unit tests, it immediately gives a permissions error when trying to create the schema. So, I get what it's trying to do (create a clean sandbox), but what I really want is to test against an existing schema. And I want to run the test with the same username/password that my server is going to use in production. And of course, that user is NOT going to have any kind of DDL type rights.
So, the basic issue boils down to this: my system (like most) wants its "app_user" account to have ONLY the permissions needed to run. Usually, this means basic "CRUD" permissions. However, Django unit tests seem to need more than this to do a test run.
How do other people handle this? Is there some setting/workaround/feature of Django that I'm not aware of? (Please refer to the initial disclaimer.)
Thanks in advance for your help.
David

Don't force Django to do something unnatural.
Allow it to create the test schema. It's a good thing.
From your existing schema, do an unload to create .JSON dump files of the data. These files are your "fixtures". These fixtures are used by Django to populate the test database. This is The Greatest Testing Tool Ever. Once you get your fixtures squared away, this really does work well.
Put your fixture files into fixtures directories within each app package.
Update your unit tests to name the various fixtures files that are required for that test case.
This -- in effect -- tests with an existing schema. It rebuilds, reloads and tests in a virgin database so you can be absolutely sure that it works without destroying (or even touching) live data.

As you've discovered, Django's default test runner makes quite a few assumptions, including that it'll be able to create a new test database to run the tests against.
If you need to override this or any of these default assumptions, you probably want to write a custom test runner. By doing so you'll have full control over exactly how tests are discovered, bootstrapped, and run.
(If you're running Django's development trunk, or are looking forward to Django 1.2, note that defining custom test runners has recently gotten quite a bit easier.)
If you poke around, you'll find a few examples of custom test runners you could use to get started.
Now, keep in mind that once you've taken control of test running you'll need to ensure that you somehow meet the same assumptions about environment that Django's built-in runner does. In particular, you'll need to somehow guarantee that whatever test database you'll use is a clean, fresh one for the tests -- you'll be quite unhappy if you try to run tests against a database with unpredictable contents.

After I read David's (OP) question, I was curious about this too, but I don't see the answer I was hoping to see. So let me try to rephrase what I think is at least part of what David is asking. In a production environment, his Django models probably will not have permission to create or drop tables, because his DBA will probably not allow it. (Let's assume this is true.) He will only be logged into the database with regular user privileges. But in his development environment, the Django unittest framework forces him to use higher-level privileges than a regular user, because Django needs them to create/drop tables for the model unittests. Since the unittests are now running at a higher privilege than will occur in production, you could argue that running the unittests in development is not 100% valid, and that errors could happen in production that might have been caught in development if Django could run the unittests with regular user privileges.
I'm curious if Django unittests will ever have the ability to create/drop tables with one user's (higher) privileges, and run the unittests with a different user's (lower) privileges. This would help more accurately simulate the production environment in development.
Maybe in practice this is really not an issue, and the risk is so minor compared to the reward that it's not worth worrying about.

Generally speaking, when unit tests depend on test data to be present, they also depend on it to be in a specific format/state. As such, your framework's policy is to not only execute DML (delete/insert test data records), but it also executes DDL (drop/create tables) to ensure that everything is in working order prior to running your tests.
What I would suggest is that you grant the necessary privileges for DDL to your app_user ONLY on your test_ database.
If you don't like that solution, then have a look at this blog entry where a developer also ran into your scenario and solved it with a workaround:
http://www.stopfinder.com/blog/2008/07/26/flexible-test-database-engine-selection-in-django/
Personally, my choice would be to modify the privileges for the test database. This way, I could rule out all other variables when comparing performance/results between testing/production environments.
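As a rough sketch of what a separate, more privileged test account could look like in a settings file: recent Django versions let the Oracle backend take a TEST dictionary, where CREATE_USER and CREATE_DB can be switched off so Django does not try to create or drop the test user and tablespaces itself. This assumes a Django version with the DATABASES setting and a test account that the DBA has already created with DDL rights. The service name, account names, and passwords below are placeholders, not values from the question.

```python
# settings.py fragment (sketch). Every name and credential here is a placeholder.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.oracle",
        "NAME": "orcl",                   # hypothetical service name
        "USER": "app_user",               # production account: CRUD only
        "PASSWORD": "change-me",
        "TEST": {
            "USER": "app_user_test",      # assumed to be pre-created with DDL rights
            "PASSWORD": "change-me-too",
            "CREATE_USER": False,         # do not create/drop the test user
            "CREATE_DB": False,           # do not create/drop test tablespaces
        },
    }
}
```

The production account keeps its minimal rights, and only the test account can create and drop tables.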
HTH,
-aj

What you can do is create separate test settings.
As I've learned at http://mindlesstechnology.wordpress.com/2008/08/16/faster-django-unit-tests/ you can use the sqlite3 backend, which is created in memory by the Django unit test framework.
Quoting:
Create a new test-settings.py file next to your app’s settings.py
containing:
from projectname.settings import *
DATABASE_ENGINE = 'sqlite3'
Then when you want to run tests real fast, instead of manage.py test,
you run
manage.py test --settings=test-settings
This runs my test suite in less than 5 seconds.
Obviously you still want to run tests on your real db backend, but
this is awesome for sanity checks, and while you’re doing test
development.
To load initial data, provide fixtures in your testcase.
    from django.test import TestCase

    class MyAppTestCase(TestCase):
        fixtures = ['myapp/fixtures/filename']
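The quoted test-settings file uses the old single-database DATABASE_ENGINE setting. From Django 1.2 onward, databases are configured through the DATABASES dictionary, so a rough equivalent might look like the sketch below. The file name and the projectname import are carried over from the quote. The import is shown only as a comment so the fragment stands on its own.

```python
# test-settings.py (sketch). In the real file, start with
#     from projectname.settings import *
# so that everything except the database is inherited from the main settings.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": ":memory:",   # in-memory database, discarded after the test run
    }
}
```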

Related

Managing migration of large number of fixtures in Django for unit testing

I currently have one fixture for all the tests in my Django application. It works; however, updating unit tests each time I add a new object to my fixture is tedious.
Object counts and query set equality checks have to be updated.
Many "get" methods fail when duplicates appear for various reasons.
Having a dataset filled with every possible use case for each unit test seems like a bad practice.
So I would like to have a fixture for each component of the app under test. For example, if I have a model class "MyModel", it has a dedicated TestCase holding the unit tests for all its functionality, and I would like that TestCase to have a dedicated fixture. The main benefit is that this would automatically solve all three points mentioned above.
However, it has some drawbacks:
File management. I copy a directory into Django's data directory for my fixture, so I would need to manage multiple data directories.
Redundancy of some fixture elements. Many elements rely on the existence of other elements, all the way up to the user object, so each fixture would need a database prefilled with some common objects (e.g. a configured user).
Django migrations.
The first two points are not the real problem, but migrating fixtures alongside my codebase looks like hell. It is already hard to manage for one fixture: this post explains how to manage code, migrations, database state, and a fixture. But this seems like too much if I have to do it for every fixture of every test.
Is there some clean way out there to migrate fixtures as you migrate the database?
PS: If it matters, I am using Django 3.2 with Django-rest-framework

How can I create a unit test for SQL statements?

I have a couple of SQL statements stored as files which get executed by a Python script. The database is hosted in Snowflake and I use Snowflake SQLAlchemy to connect to it.
How can I test those statements? I don't want to execute them, I just want to check if they could be executable.
One very basic thing to check is whether it is valid standard SQL. A better answer would be something that considers Snowflake-specific stuff like
copy into s3://example from table ...
The best answer would be something that also checks permissions, e.g. for SELECT statements if the table is visible / readable.

An in-memory sqlite database is one option. But if you are executing raw SQL queries against snowflake in your code, your tests may fail if the same syntax isn't valid against sqlite. Recording your HTTP requests against a test snowflake database, and then replaying them for your unit tests suits this purpose better. There are two very good libraries that do this, check them out:
vcrpy
betamax

We do run integration tests on our Snowflake databases. We maintain clones of our production databases. For example, one of our production databases is called data_lake, and we maintain a clone called data_lake_test that is re-cloned nightly and used to run our integration tests against.
Like Tim Biegeleisen mentioned, a "true" unittest would mock the response but our integration tests do run real Snowflake queries on our test cloned databases. There is the possibility that a test drastically alters the test database, but we run integration tests only during our CI/CD process so it is rare if there is ever a conflict between two tests.

I very much like this idea; however, I can suggest a workaround, as I often have to check my syntax and need help there. If you plan on using the Snowflake interface, I would recommend adding LIMIT 10 or LIMIT 1 to the SELECT statements that you need to validate.
Another tip I would recommend is talking to a Snowflake representative about a trial if you are just getting started. They will also have a lot of tips for the more specific queries you are seeking to validate.
And finally, based on some comments, make sure your SQL is ANSI SQL, and keep the documentation at https://docs.snowflake.net/manuals/index.html open for reference.

As far as the validity of the SQL statement is concerned, you can run EXPLAIN on the statement; it should give you an error if the syntax is incorrect or if you do not have permission to access the object/database. That said, there are still some statements you cannot run EXPLAIN on, such as the USE command, which I do not think needs validating anyway.
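The EXPLAIN idea can be sketched with Python's built-in sqlite3 module standing in for Snowflake. That substitution is an assumption made so the example runs anywhere. SQLite's EXPLAIN output and error messages differ from Snowflake's, and SQLite has no permission model to check. The helper name explain_check and the table names are made up for illustration. Against Snowflake you would pass your own connection and catch that driver's error class instead.

```python
import sqlite3


def explain_check(conn, statement):
    """Return None if EXPLAIN accepts the statement, else the error message."""
    try:
        # EXPLAIN asks the engine to compile the statement without running it.
        conn.execute("EXPLAIN " + statement).fetchall()
    except sqlite3.Error as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
    for sql in (
        "SELECT id FROM orders",       # valid
        "SELEC id FROM orders",        # syntax error
        "SELECT id FROM missing",      # unknown table
    ):
        print(sql, "->", explain_check(conn, sql) or "ok")
```

Looping over the SQL files and collecting the non-None results would give a simple pass/fail report without changing any data.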
Hope this helps.

How to get Django unittest to commit/save data to the database

I'm debugging a big unittest test for django & would like to use my normal debugging tools to do it:
looking at the db in the django admin through runserver
looking in the db manually.
Neither works, because unittest hasn't committed the transaction it's running the db side of the test in.
The obvious solution seems to be to just tell unittest not to use a transaction, or get it to commit somehow. Another way would be to create a custom settings file which would let runserver connect to the transaction. But the first idea seems like it should be really easy. Any ideas? I'm using MySQL & django 1.3.1

Consider using TransactionTestCase as the parent class of your test cases rather than TestCase. TransactionTestCase doesn't use the transaction behavior of TestCase, so you can commit at the point where you need to inspect the database state.
Additionally, if your unit test is so big that you need to inspect its database state while it's running, you're probably doing it wrong. A unit test should test one thing and one thing only, and it should be fairly obvious what the state is at any point. See Carl Meyer's Pycon 2012 talk on testing in Django for some excellent advice on writing good unit tests.
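To see why the admin and a manual database session show nothing while the test is running, here is a small sketch of the underlying behavior. It uses Python's built-in sqlite3 module and a temporary file as a stand-in for MySQL (an assumption made so the example is self-contained). One connection plays the test, the other plays runserver or a database shell. The table and function names are invented for illustration.

```python
import os
import sqlite3
import tempfile


def count_rows(conn):
    # fetchall() exhausts the cursor so the reader does not keep a lock open.
    return conn.execute("SELECT COUNT(*) FROM item").fetchall()[0][0]


def demo():
    """Return (rows the reader sees before commit, rows it sees after commit)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        writer = sqlite3.connect(path)   # plays the role of the running test
        reader = sqlite3.connect(path)   # plays the role of runserver / a DB shell
        try:
            writer.execute("CREATE TABLE item (name TEXT)")
            writer.commit()
            # The INSERT opens a transaction that stays open until commit().
            writer.execute("INSERT INTO item VALUES ('widget')")
            before = count_rows(reader)
            writer.commit()
            after = count_rows(reader)
            return before, after
        finally:
            writer.close()
            reader.close()


if __name__ == "__main__":
    print(demo())
```

The second connection only sees the row once the first one commits. That is the situation a TestCase that wraps each test in a rolled-back transaction leaves you in, and what TransactionTestCase avoids.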

Have different initial_data fixtures for different stages (testing v. production)

I have an initial_data fixture that I want to load every time except in production. I already have different settings files for production and non-production deployments.
Any suggestions on how to accomplish this?
Clarification: I do not want test fixtures. Basically, I just need the fixture to be loaded based on a setting change of some sort. I'll be digging into the Django code to see if I could figure out an elegant way to accomplish this.

You can actually set up different test fixtures for each test if you want:
http://docs.djangoproject.com/en/dev/topics/testing/#topics-testing-fixtures
If you only want to load the fixtures one time, you can also write a custom TestRunner that will allow you to do that setup at the beginning:
http://docs.djangoproject.com/en/dev/topics/testing/#using-different-testing-frameworks
Both of those will still load the data from the production fixtures, as that is done with syncdb, but you can override the data, or even delete it all. This may not be optimal if you are loading large amounts of data into your production product. If this is the case, I would recommend adding a custom command like load_production_data that allows you to do it quickly and easily from the command line.

The easiest way is to use manage.py testserver [fixture ...]
If this is a staging (rather than dev) deployment, though, you may not want to use django's builtin server. In that case, a quick (if hacky) way of doing what you're after is to have the fixtures in an app (called, for example, "undeployed") that is only installed in your non-production settings.

Django workflow when modifying models frequently?

As I usually don't do up-front design of my models in Django projects, I end up modifying the models a lot and thus deleting my test database every time (because "syncdb" won't ever alter the tables automatically for you). My workflow is below, and I'd like to hear about yours. Any thoughts welcome.
1. Modify the model.
2. Delete the test database. (always a simple sqlite database for me.)
3. Run "syncdb".
4. Generate some test data via code.
5. goto 1.
A secondary question regarding this: if your workflow is like the one above, how do you execute step 4? Do you generate the test data manually, or is there a proper hook point in Django apps where you can inject the test-data-generating code at server startup?
TIA.

Steps 2 & 3 can be done in one step:
manage.py reset appname
Step 4 is most easily managed, from my understanding, by using fixtures

This is a job for Django's fixtures. They are convenient because they are database independent and the test harness (and manage.py) have built-in support for them.
To use them:
Set up your data in your app (call it "foo") using the admin tool
Create a fixtures directory in your "foo" app directory
Type: python manage.py dumpdata --indent=4 foo > foo/fixtures/foo.json
Now, after your syncdb stage, you just type:
python manage.py loaddata foo.json
And your data will be re-created.
If you want them in a test case:
    from django.test import TestCase

    class FooTests(TestCase):
        fixtures = ['foo.json']
Note that you will have to recreate or manually update your fixtures if your schema changes drastically.
You can read more about fixtures in the django docs for Fixture Loading
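If you do end up updating a fixture by hand, it helps to know its shape. A JSON fixture is a list of records, each with a "model" label, a "pk", and a "fields" mapping. The sketch below builds one with the standard json module. The Widget model and its fields are hypothetical, invented only to show the layout that dumpdata produces for the "foo" app.

```python
import json

# Hypothetical records for a "Widget" model in the "foo" app.
records = [
    {"model": "foo.widget", "pk": 1, "fields": {"name": "left-handed", "price": "9.99"}},
    {"model": "foo.widget", "pk": 2, "fields": {"name": "right-handed", "price": "12.50"}},
]


def dump_fixture(rows):
    """Serialize records the way a hand-maintained foo.json would look."""
    return json.dumps(rows, indent=4)


if __name__ == "__main__":
    print(dump_fixture(records))
```

Saving that output as foo/fixtures/foo.json would make it loadable with the loaddata command shown above, provided the model and field names match your real schema.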

Here's what we do.
Apps are named with a Schema version number. appa_2, appb_1, etc.
Minor changes don't change the number.
Major changes increment the number. Syncdb works. And a "data migration" script can be written.
    def migrate_appa_2_to_3():
        for a in appa_2.SomeThing.objects.all():
            appa_3.AnotherThing.create( a.this, a.that )
            appa_3.NewThing.create( a.another, a.yetAnother )
        for b in ...
The point is that drop and recreate isn't always appropriate. It's sometimes helpful to move data from the old model to the new model without rebuilding from scratch.

South is the coolest.
Though good ol' reset works best when data doesn't matter.
http://south.aeracode.org/

To add to Matthew's response, I often also use custom SQL to provide initial data as documented here.
Django just looks for files in <app>/sql/<modelname>.sql and runs them after creating tables during syncdb or sqlreset. I use custom SQL when I need to do something like populate my Django tables from other non-Django database tables.

Personally, the development db for a project I'm working on right now is rather large, so I use dmigrations to create db migration scripts to modify the db (rather than wiping out the db every time like I did in the beginning).
Edit: Actually, I'm using South now :-)
