How to run Django's test database only in memory?

My Django unit tests take a long time to run, so I'm looking for ways to speed that up. I'm considering installing an SSD, but I know that has its downsides too. Of course, there are things I could do with my code, but I'm looking for a structural fix. Even running a single test is slow since the database needs to be rebuilt / south migrated every time. So here's my idea...
Since I know the test database will always be quite small, why can't I just configure the system to always keep the entire test database in RAM? Never touch the disk at all. How do I configure this in Django? I'd prefer to keep using MySQL since that's what I use in production, but if SQLite 3 or something else makes this easy, I'd go that way.
Does SQLite or MySQL have an option to run entirely in memory? It should be possible to configure a RAM disk and then configure the test database to store its data there, but I'm not sure how to tell Django / MySQL to use a different data directory for a certain database, especially since it keeps getting erased and recreated each run. (I'm on a Mac FWIW.)

If you set your database engine to sqlite3 when you run your tests, Django will use an in-memory database.
I'm using code like this in my settings.py to set the engine to sqlite when running my tests:
if 'test' in sys.argv:
    DATABASE_ENGINE = 'sqlite3'
Or in Django 1.2:
if 'test' in sys.argv:
    DATABASES['default'] = {'ENGINE': 'sqlite3'}
And finally in Django 1.3 and 1.4:
if 'test' in sys.argv:
    DATABASES['default'] = {'ENGINE': 'django.db.backends.sqlite3'}
(The full path to the backend isn't strictly necessary with Django 1.3, but makes the setting forward compatible.)
You can also add the following line, in case you are having problems with South migrations:
SOUTH_TESTS_MIGRATE = False
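The sys.argv check above matches the word test anywhere on the command line. A minimal sketch of a slightly stricter variant follows; select_databases and PRODUCTION_DATABASES are hypothetical names made up for illustration, not part of Django, and the production values are placeholders.

```python
import sys

# Hypothetical production config; the values are placeholders.
PRODUCTION_DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "mydb",
    }
}


def select_databases(argv, production):
    """Return SQLite settings for `manage.py test`, else the production dict."""
    # Only treat "test" as a trigger when it is the subcommand (argv[1]).
    if argv[1:2] == ["test"]:
        return {"default": {"ENGINE": "django.db.backends.sqlite3"}}
    return production


DATABASES = select_databases(sys.argv, PRODUCTION_DATABASES)
```

Keeping the decision in a small function makes it easy to check both branches without actually launching the test runner.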

I usually create a separate settings file for tests and use it in the test command, e.g.
python manage.py test --settings=mysite.test_settings myapp
It has two benefits:
You don't have to check for test or any such magic word in sys.argv; test_settings.py can simply be
from settings import *
# make tests faster
SOUTH_TESTS_MIGRATE = False
DATABASES['default'] = {'ENGINE': 'django.db.backends.sqlite3'}
Or you can tweak it further for your needs, cleanly separating test settings from production settings.
Another benefit is that you can still run the tests with the production database engine instead of sqlite3, which catches subtle engine-specific bugs. So while developing use
python manage.py test --settings=mysite.test_settings myapp
and before committing code run once
python manage.py test myapp
just to be sure that all tests are really passing.

MySQL supports a storage engine called "MEMORY", which you can configure in your database config (settings.py) as such:
    'USER': 'root',      # Not used with sqlite3.
    'PASSWORD': '',      # Not used with sqlite3.
    'OPTIONS': {
        "init_command": "SET storage_engine=MEMORY",
    }
Note that the MEMORY storage engine doesn't support blob / text columns, so if you're using django.db.models.TextField this won't work for you.

I can't answer your main question, but there are a couple of things that you can do to speed things up.
Firstly, make sure that your MySQL database is set up to use InnoDB. Then Django can use transactions to roll back the state of the db after each test, which in my experience has led to a massive speed-up. You can pass a database init command in your settings.py (Django 1.2 syntax):
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'HOST': 'localhost',
        'NAME': 'mydb',
        'USER': 'whoever',
        'PASSWORD': 'whatever',
        'OPTIONS': {"init_command": "SET storage_engine=INNODB"}
    }
}
Secondly, you don't need to run the South migrations each time. Set SOUTH_TESTS_MIGRATE = False in your settings.py and the database will be created with plain syncdb, which will be much quicker than running through all the historic migrations.

You can apply two tweaks together:
use transactional tables: the initial fixture state will be restored with a database rollback after every TestCase.
put your database data dir on a ramdisk: database creation becomes much quicker, and running the tests will be faster too.
I'm using both tricks and I'm quite happy.
How to set it up for MySQL on Ubuntu:
$ sudo service mysql stop
$ sudo cp -pRL /var/lib/mysql /dev/shm/mysql
$ sudo vim /etc/mysql/my.cnf
# in my.cnf, set: datadir = /dev/shm/mysql
$ sudo service mysql start
Beware, this is just for testing: after a reboot, the in-memory database is lost!

Another approach: run a second instance of MySQL whose data lives on a tmpfs RAM disk. Instructions are in this blog post: Speeding up MySQL for testing in Django.
Advantages:
You use exactly the same database that your production server uses
no need to change your default mysql configuration

Extending Anurag's answer, I simplified the process by creating the same test_settings and adding the following to manage.py:
if len(sys.argv) > 1 and sys.argv[1] == "test":
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.test_settings")
else:
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")
This seems cleaner, since sys is already imported and manage.py is only used via the command line, so there is no need to clutter up settings.
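The same selection logic can be sketched as a small function, which makes it testable on its own. settings_module_for is a hypothetical name, and the mysite.* module paths are taken from the snippet above.

```python
import os
import sys


def settings_module_for(argv):
    """Pick the settings module based on the manage.py subcommand."""
    if len(argv) > 1 and argv[1] == "test":
        return "mysite.test_settings"
    return "mysite.settings"


# setdefault leaves an explicitly exported DJANGO_SETTINGS_MODULE untouched.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module_for(sys.argv))
```

Because setdefault is used, passing --settings or exporting the variable yourself still takes precedence.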

Use the following in your settings.py:
DATABASES['default']['ENGINE'] = 'django.db.backends.sqlite3'

Related

Database Management on Django and Github

I am trying to set up a website using the Django framework. Because of its convenience, I chose SQLite as my database at the start of my project. It's very easy to use and I was very happy with this solution.
Being a new developer, I am quite new to GitHub and database management. Since SQLite databases are stored in a single file, I was able to push my updates to GitHub until that .db file grew past 100MB. Since then, my file has been too large to push to my repository (for others having the same problem I found satisfying answers here: GIT: Unable to delete file from repo).
Because of this problem, I am now considering an alternative solution:
Since my website will require users to interact with my database (they are expected to post a certain amount of data), I am thinking about switching from SQLite to MySQL. I was told MySQL will handle user input better and will scale more easily (I dare to expect a large volume of users). This is the first part of my question. Is switching to MySQL after having used SQLite for a while a good idea/good practice, or will it lead to migration problems?
If the answer to that first question is yes, then I have other questions about how to handle this change. Since SQLite is serverless, I will have to set up a new server for MySQL. Will I be able to access my data remotely with that server? Since I used to push my database to my GitHub repository, that is where I used to get my data from when I wanted to work remotely. Will there be a way for me to host my data on a server (hopefully for free) and fetch it the same way I fetch my code from GitHub?
Thank you very much for your help and I hope you have a nice day.

First of all, you shouldn't be uploading any sensitive data to your repository. That includes database passwords, Django's secret key or the database itself in the case of SQLite.
Answering your first question, there shouldn't be any problem switching from SQLite to MySQL. Django handles migrations exceptionally well and SQLite has fewer features than MySQL. To migrate your data to a MySQL database you can use Django's dumpdata and loaddata.
Now, your second question is a bit more complicated. You can always expose your database to the Internet, but that is usually not a good idea unless you know exactly what you're doing and know how to secure it properly. If you go this way though, you can just change the database parameters in your settings file to point to your MySQL database's public IP and add the db name, user and password.
My recommendation though is to have one database for development in your dev PC and another in your production server that is behind a firewall and can only be accessed through localhost. I don't think you need the db in your dev pc to be always up to date, if you have some sample data that should be enough.
So, instead of writing sensitive data into the settings file you can have a secrets.json file in the root of your project that looks like this (set "debug" to true on your dev PC and false on your prod server; JSON itself does not allow comments):
{
    "secret_key": "YOURSUPERSECRETKEY",
    "debug": true,
    "allowed_hosts": ["127.0.0.1" , "localhost", "YOUR"],
    "db_name": "YOURDBNAME",
    "db_user": "YOURDBUSER",
    "db_password": "YOURDBPASSWORD",
    "db_host": "localhost",
    "db_port": 3306
}
This file should be included in your .gitignore so it doesn't get pushed to your repository and you would have one in your local pc and another one with different settings in your production server (you can use vi or nano to create the file).
Then in your settings.py file you can do the following:
import json
import os
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
try:
    with open(os.path.join(BASE_DIR, 'secrets.json')) as handle:
        SECRETS = json.load(handle)
except IOError:
    # If secrets.json is missing, the lookups below raise KeyError.
    SECRETS = {}
SECRET_KEY = SECRETS['secret_key']
ALLOWED_HOSTS = SECRETS['allowed_hosts']
DEBUG = SECRETS['debug']
...
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': SECRETS['db_name'],
        'USER': SECRETS['db_user'],
        'PASSWORD': SECRETS['db_password'],
        'HOST': SECRETS['db_host'],
        'PORT': SECRETS['db_port'],
    }
}
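One weakness of the snippet above is that a missing or incomplete secrets.json only surfaces as a bare KeyError on the first lookup. A minimal sketch of a more explicit check follows; parse_secrets, load_secrets and REQUIRED_KEYS are hypothetical names, and the key list simply mirrors the example file above.

```python
import json

# Keys used in the example secrets.json above; adjust to match your own file.
REQUIRED_KEYS = ("secret_key", "debug", "allowed_hosts", "db_name",
                 "db_user", "db_password", "db_host", "db_port")


def parse_secrets(text):
    """Parse the JSON text of secrets.json and check the required keys."""
    secrets = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in secrets]
    if missing:
        raise KeyError("secrets.json is missing: " + ", ".join(missing))
    return secrets


def load_secrets(path):
    """Read and validate a secrets file; a missing file raises OSError."""
    with open(path) as handle:
        return parse_secrets(handle.read())
```

In settings.py this could be used as SECRETS = load_secrets(os.path.join(BASE_DIR, 'secrets.json')), so a misconfigured server fails with a message naming the missing keys.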

Django code to execute only in development and production

I need to execute some housekeeping code, but only in the development or production environment. Unfortunately, all management commands execute it just as runserver does. Is there any clean way to detect the execution environment and run the code selectively?
I saw some solutions like checking 'runserver' in sys.argv,
but that does not work for production, and it does not look very clean.
Does Django provide anything to distinguish the different scenarios the code is executing in?
Edit
The real problem is that, once the apps are loaded, we need to initialise our local cache with some frequently accessed data. In general I want to query the DB for some specific information and cache it (currently in memory). The issue is that when the code queries the DB, the table may not have been created yet; in fact, the migration files may not exist at all. So, when I run makemigrations/migrate, it runs this code, which tries to query the DB and throws an error saying the table does not exist. But if I can't run makemigrations/migrate, there will be no table; it is a loop I'm trying to avoid. This code runs for all management commands, but I would like to restrict its execution to when the app is actually serving requests (that is when the cache is needed) and not for any management commands (including user-defined ones).
```python
from django.apps import AppConfig
from my_app.signals import app_created


class MyAppConfig(AppConfig):
    name = 'my_app'

    def ready(self):
        import my_app.signals
        # Code below should be executed only in actual app execution
        # And not in makemigration/migrate etc management commands
        app_created.send(sender=MyAppConfig, sent_by="MyApp")
```
Q) Send app created signal for app execution other than executions due to management commands like makemigrations, migrate, etc.

There are so many different ways to do this. But generally when I create a production (or staging, or development) server I set an environment variable. And dynamically decide which settings file to load based on that environment variable.
Imagine something like this in a Django settings file:
import os
ENVIRONMENT = os.environ.get('ENVIRONMENT', 'development')
Then you can use
from django.conf import settings
if settings.ENVIRONMENT == 'production':
    # do something only on production
    pass
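As a minimal sketch of the environment-variable approach, the lookup can be wrapped in helpers that accept the environment mapping as a parameter. get_environment and is_production are hypothetical names; only the ENVIRONMENT variable and its 'development' default come from the snippet above.

```python
import os


def get_environment(environ=os.environ):
    """Return the deployment name, defaulting to 'development'."""
    return environ.get("ENVIRONMENT", "development")


def is_production(environ=os.environ):
    """True only when ENVIRONMENT is explicitly set to 'production'."""
    return get_environment(environ) == "production"
```

Passing the mapping in explicitly means the behaviour can be checked with a plain dict instead of mutating the real process environment.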

Since I did not get a convincing answer and managed to pull off a solution, although not a 100% clean one, I thought I would share what I ended up with.
import sys
from django.conf import settings
if (settings.DEBUG and 'runserver' in sys.argv) or not settings.DEBUG:
    """your code to run only in development and production"""
The rationale is you run the code if it is not in DEBUG mode no matter what. But if it is in DEBUG mode check if the process execution had runserver in the arguments.
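The condition can be written as a small predicate to make its behaviour explicit. This is only a sketch of the same logic; should_run_startup_code is a hypothetical name, and the DEBUG flag and argument list are passed in rather than read from Django.

```python
def should_run_startup_code(debug, argv):
    """Mirror the condition above: always run outside DEBUG,
    and in DEBUG only when runserver is among the arguments."""
    return (not debug) or ("runserver" in argv)
```

Note that with DEBUG off the predicate is true for every command, so a production migrate would still trigger the startup code; that is the part the answer calls not 100% clean.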

How can i stop django to recreate test database

I am new to Django testing and I have just one simple test that prints a line:
class SimpleTest(TestCase):
    def setUp(self):
        self.kid = mommy.make(User)
    def test_details(self):
        print self.kid
        self.assertEqual(200, 200)
I run the test with this command
./manage.py test tests/myapp/
It really takes 3 minutes to run that test. Django first says it is creating the database and then waits 3 minutes before showing the result.
If I change one word in the test, I have to wait 3 minutes again. It's very annoying.
I think it may be because it's recreating the database every time with many migrations.
Is there any way to make it fast or to stop recreating the database every time?
I am using Django 1.7.
The latest dev version has the --keepdb option, but it's not in 1.7.

If you are just testing for development purposes, I'd recommend setting your testing database to use sqlite3, which will leverage an in-memory database by default and should speed up tests.
I usually put this into local_settings which only gets executed for my dev environment...
if 'test' in sys.argv:
    DATABASES['default'] = {'ENGINE': 'django.db.backends.sqlite3'}
If you are planning to push a release to a production environment, you'll want to test against the engine which is serving your production database (MySQL, PostgreSQL, etc).
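For reference, this is what an in-memory SQLite database looks like at the driver level. The sketch uses only Python's standard sqlite3 module; the kid table is a made-up example, and this is not Django's own test setup code.

```python
import sqlite3

# ":memory:" asks SQLite for a database that lives only in RAM;
# nothing is written to disk and the data vanishes when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kid (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO kid (name) VALUES (?)", ("alice",))
rows = conn.execute("SELECT name FROM kid").fetchall()
conn.close()
```

Because the whole database is created from scratch in memory, setup is cheap, which is where most of the test speed-up comes from.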

Test database accessibility in Django

My question is very similar to this question
I'm just getting started with Django, and I find myself attempting to learn how it works any time I have a spare moment and my laptop available. I've found that Heroku is a pretty great place to test things, but I can't always reach the internet if I'm waiting to pick up kids, or something similar. In development, I would like to create a test that will check if a DB is accessible. If not, fail over to an SQLite DB.
I started with code heavily borrowed from here:
def pingable(hostname):
    try:
        return os.system("ping -c 1 " + hostname + " > /dev/null 2>&1") == 0
    except:
        return False
if (not pingable(DATABASES['default']['HOST'])):
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.sqlite3',
            'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
        }
    }
I simply plop that code immediately after the DATABASES variable is set. But this has a few weaknesses. The most glaring is that AWS (Which Heroku uses) doesn't respond to pings unless you specifically enable them . . . and honestly, why make things less secure if you don't have to?
So in the interest of not reinventing the wheel, this has led me to ask this question: has someone created a way to check if a Django DB is accessible?
I really only need to check Postgres . . . but I'd really love to find a generic solution, so half credit if you can point me to a solution that only works for Postgres.
Edit: To clarify, the internet itself may be available, but the necessary port(s) may be blocked by a firewall . . . it's hard to know what will be available

This isn't really the way to manage database settings between Heroku and your local dev machine.
Heroku manages all these sorts of settings via environment variables, which is one of the principles of the 12-factor app. They've also made a Django library, dj-database-url, which reads those env vars and automatically configures the settings appropriately.
You should use this for your database settings, and then you can set a local env var DATABASE_URL with the address of your local sqlite3 database. Then your app will automatically run in both dev and production and configure itself to point to the relevant database automatically.
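To show the idea without any third-party dependency, here is a simplified stand-in written with the standard library only. It is not dj-database-url's implementation or API; database_from_url and SQLITE_FALLBACK are hypothetical names, only postgres URLs are handled, and the 'django.db.backends.postgresql' engine path assumes a reasonably recent Django (older releases used 'postgresql_psycopg2').

```python
import os
from urllib.parse import urlparse

# Hypothetical local fallback used when DATABASE_URL is not set.
SQLITE_FALLBACK = {"ENGINE": "django.db.backends.sqlite3", "NAME": "db.sqlite3"}


def database_from_url(url):
    """Build a Django-style database dict from a postgres:// URL."""
    if not url:
        return dict(SQLITE_FALLBACK)
    parts = urlparse(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError("unsupported database URL scheme: " + parts.scheme)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parts.path.lstrip("/"),
        "USER": parts.username or "",
        "PASSWORD": parts.password or "",
        "HOST": parts.hostname or "",
        "PORT": parts.port or "",
    }


DATABASES = {"default": database_from_url(os.environ.get("DATABASE_URL"))}
```

With this shape, the laptop simply leaves DATABASE_URL unset and gets SQLite, while Heroku's environment supplies the Postgres URL, so no reachability probe is needed.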

Django Test Error "Permission denied to create database" - Using Heroku Postgres

I am trying to do a simple Django test using a tests.py file in a Django app directory (mysite/polls/tests.py), but every time I run 'python manage.py test polls', I get the error:
C:\Python27\python.exe "C:\Program Files (x86)\JetBrains\PyCharm 3.0\helpers\pycharm\django_test_manage.py" test polls "C:\Users\<myname>\PycharmProjects\mysite"
Testing started at 8:40 PM ...
Creating test database for alias 'default'...
Type 'yes' if you would like to try deleting the test database 'test_<database>', or 'no' to cancel: Got an error creating the test database: permission denied to create database
From what I've read, apparently Heroku PG uses a shared database, so I do not have the permission to create/destroy databases, which is necessary for testing. Is there an obvious solution to this? I am still developing on my local drive, so any workarounds would be appreciated. I know that testing is an important part of programming, so I would like to be able to implement a testing method as soon as possible.
I am trying to test using the TestCase django class.
What I am using:
1) Heroku Postgres Hobby Dev Plan
2) Postgres 9.3.3
3) Python 2.7.6
4) Django 1.6.1
EDIT:
So after doing a bit more research, I found out that I can override my DATABASES dict variable in settings.py to use SQLite to test locally (when 'test' is an argument in shell), but I would still prefer a PostgreSQL implementation, since from what I read, PostgreSQL is more strict (which I am a fan of).
For anyone interested in the semi-solution I have found (courtesy of another member of Stackoverflow):
if 'test' in sys.argv:
    DATABASES['default'] = {'ENGINE': 'django.db.backends.sqlite3'}
Don't forget to import sys.

There's no reason for you to need to create a database with your tests. Instead, change your tests to access the local database, if a DATABASE_URL environment variable is undefined. Here's what I do in Node.js, where I have local test and dev databases, and a Heroku-provided production db:
if (typeof(process.env.DATABASE_URL) !== 'undefined') {
    dbUrl = url.parse(process.env.DATABASE_URL);
}
else if (process.env.NODE_ENV === 'test') {
    dbUrl = url.parse('tcp://postgres:postgres@127.0.0.1:5432/test');
}
else {
    dbUrl = url.parse('tcp://postgres:postgres@127.0.0.1:5432/db');
}
