I would like to get my Celery worker process to talk to the Django test database.
It's an Oracle database, so I believe the database/user is already created.
I am just trying to figure out what to pass in the Celery app configuration to get it to talk to the "TEST" database.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.oracle',
        .............
        'TEST': {
            'USER': WIT_TEST_DB_USER,
            'PASSWORD': WIT_TEST_DB_USER,
        }
    }
}
I have seen a Stack Overflow article that talks about passing the settings configuration from the parent test setUp() to the worker process. That may be necessary when the test database file is automatically generated, as in the case of SQLite databases.
In my case, it's a well-defined Oracle test database that I think is already part of the config/settings files.
So I am looking for a way to start the worker process directly, independent of the test runner/test case code.
Can someone suggest an approach to doing this?
You are treating your test database as an ordinary database, so I think the best solution is to define it as the default database under the DATABASES setting in a separate settings file. When starting your worker, you can point it at that settings file like this:
export DJANGO_SETTINGS_MODULE='[python path to your celery specific settings file]'
# the command to run your celery worker
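For example, here is a minimal sketch of such a settings file; the module path myproject.celery_test_settings and the project name myproject are assumptions, and the credentials are simply taken from the TEST block shown above:

# myproject/celery_test_settings.py -- hypothetical module
from myproject.settings import *  # reuse everything from the normal settings

# Promote the test credentials so the worker's default connection hits the
# Oracle test schema instead of the regular one. (For Oracle, switching
# USER/PASSWORD is what selects the test schema; the NAME/TNS entry usually
# stays the same.)
DATABASES['default']['USER'] = DATABASES['default']['TEST']['USER']
DATABASES['default']['PASSWORD'] = DATABASES['default']['TEST']['PASSWORD']

With that in place, the worker can be started independently of the test runner:

export DJANGO_SETTINGS_MODULE='myproject.celery_test_settings'
celery -A myproject worker --loglevel=info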
I need to develop a new Django project (let's call it new_django) using a SQL Server 2019 database named AppsDB which already hosts another Django project (let's call it old_django). The two apps are completely separate from each other. Unfortunately, I can't get a new database for each new Django project, so I have to reuse AppsDB. What I don't understand is: how can I tell Django not to overwrite the existing auth_... and django_... tables generated by old_django?
My first idea was to use different schemas for the two projects, but Django doesn't support this with a SQL Server database as far as I know. Some workarounds suggest changing the database's default schema for a given user, like this answer. But I won't get a new user for every project either, and relying on manually changing the DB schema every time before I migrate something will most certainly cause a mess at some point.
I'm stuck with the current setup and would like to know if anyone has come up with a more elegant solution or a different approach to solve my problem.
Any help is much appreciated!
All you need to do is create a new database on the SQL Server instance and then point your Django application at it, like this:
DATABASES = {
    'default': {
        'ENGINE': 'mssql',
        'NAME': 'YOUR_DATABASE_NAME',
        'USER': 'DB_USER',
        'PASSWORD': 'DB_PASSWORD',
        'HOST': 'YOUR_DATABASE_HOST',
        'PORT': '',
        'OPTIONS': {
            'driver': 'ODBC Driver 13 for SQL Server',
        },
    }
}
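Once the application points at the new database, a normal migration run will create Django's auth_* and django_* tables there, without touching the tables old_django created in AppsDB:

python manage.py migrate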
I am trying to set up a website using the Django framework. Because of its convenience, I had chosen SQLite as my database since the start of my project. It's very easy to use and I was very happy with this solution.
Being a new developer, I am quite new to GitHub and database management. Since SQLite databases are located in a single file, I was able to push my updates to GitHub until that .db file grew past 100 MB. Since then, the file has been too large to push to my repository (for others having the same problem, I found satisfying answers here: GIT: Unable to delete file from repo).
Because of this problem, I am now considering an alternative solution:
Since my website will require users to interact with my database (they are expected to post a certain amount of data), I am thinking about switching from SQLite to MySQL. I was told MySQL will handle user input better and will scale more easily (I dare to expect a large volume of users). This is the first part of my question: is switching to MySQL after having used SQLite for a while a good idea/good practice, or will it lead to migration problems?
If the answer to that first question is yes, then I have other questions about how to handle this change. Since SQLite is serverless, I will have to set up a new server for MySQL. Will I be able to access my data remotely with that server? Since I used to push my database to my GitHub repository, that is where I used to get my data from when I wanted to work remotely. Will there be a way for me to host my data on a server (hopefully for free) and fetch it the same way I fetch my code on GitHub?
Thank you very much for your help and I hope you have a nice day.
First of all, you shouldn't be uploading any sensitive data to your repository. That includes database passwords, Django's secret key or the database itself in the case of SQLite.
Answering your first question: there shouldn't be any problem switching from SQLite to MySQL. Django handles migrations exceptionally well, and SQLite has fewer features than MySQL. To move your data to a MySQL database you can use Django's dumpdata and loaddata management commands.
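As a rough sketch of that dump-and-load workflow (data.json is just an example file name; the --exclude flags skip rows that migrate recreates anyway):

python manage.py dumpdata --natural-foreign --natural-primary --exclude contenttypes --exclude auth.permission > data.json
# point DATABASES in settings.py at the MySQL server, then:
python manage.py migrate
python manage.py loaddata data.json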
Now, your second question is a bit more complicated. You can always expose your database to the Internet, but that is usually not a good idea unless you know exactly what you're doing and know how to secure it properly. If you go this way though, you can just change the database parameters in your settings file to point to your MySQL database's public IP and add the db name, user and password.
My recommendation though is to have one database for development on your dev PC and another on your production server that is behind a firewall and can only be accessed through localhost. I don't think the DB on your dev PC needs to always be up to date; if you have some sample data, that should be enough.
So, instead of writing sensitive data into the settings file, you can have a secrets.json file in the root of your project that looks like this (set "debug" to true on your dev PC and false on your production server):
{
    "secret_key": "YOURSUPERSECRETKEY",
    "debug": true,
    "allowed_hosts": ["127.0.0.1", "localhost", "YOUR"],
    "db_name": "YOURDBNAME",
    "db_user": "YOURDBUSER",
    "db_password": "YOURDBPASSWORD",
    "db_host": "localhost",
    "db_port": 3306
}
This file should be listed in your .gitignore so it doesn't get pushed to your repository, and you would have one on your local PC and another one with different settings on your production server (you can use vi or nano to create the file).
Then in your settings.py file you can do the following:
import json
import os

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

try:
    with open(os.path.join(BASE_DIR, 'secrets.json')) as handle:
        SECRETS = json.load(handle)
except IOError:
    SECRETS = {}
SECRET_KEY = SECRETS['secret_key']
ALLOWED_HOSTS = SECRETS['allowed_hosts']
DEBUG = SECRETS['debug']
...
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': SECRETS['db_name'],
        'USER': SECRETS['db_user'],
        'PASSWORD': SECRETS['db_password'],
        'HOST': SECRETS['db_host'],
        'PORT': SECRETS['db_port'],
    }
}
I'm trying to implement a failover strategy when my MySQL backend is down in Celery.
I found in this other Stack Overflow answer that failover is possible with SQLAlchemy. However, I couldn't get the same behavior in Celery using sqlalchemy_engine_options:
app.conf.result_backend = 'db+mysql://scott:tiger@localhost/foo'
app.conf.sqlalchemy_engine_options = {
    'connect_args': {
        'failover': [{
            'user': 'root',
            'password': 'password',
            'host': 'http://other_db.com',
            'database': 'dbname'
        }]
    }
}
What I'm trying to do is: if the first backend (scott:tiger) does not respond, switch to the root:password backend.
There is definitely more than one way to achieve failover. You could start with a simple try..except and handle the situation when your preferred backend is not responding; in the simplest (and probably not very Pythonic) way you could try something like this:
try:
    # initialise your SQL connection here and set the connection up
    pass
except Exception:
    # initialise your backup SQL connection here
    pass
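For instance, a minimal sketch of that idea applied to Celery's result backend; the broker URL and the pick_backend helper are assumptions (not part of Celery's API), and the two MySQL URLs come from the question:

from celery import Celery
import sqlalchemy

PRIMARY = 'db+mysql://scott:tiger@localhost/foo'
FALLBACK = 'db+mysql://root:password@other_db.com/dbname'

def pick_backend(urls):
    """Return the first result-backend URL whose database accepts a connection."""
    for url in urls:
        try:
            # Celery prefixes SQLAlchemy URLs with 'db+'; strip it before probing.
            engine = sqlalchemy.create_engine(url.replace('db+', '', 1))
            with engine.connect():
                return url
        except Exception:
            continue
    raise RuntimeError('no result backend reachable')

app = Celery('tasks', broker='redis://localhost:6379/0')
app.conf.result_backend = pick_backend([PRIMARY, FALLBACK])

Note that this only checks availability at worker start-up; it does not fail over in the middle of a run.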
You could also move your backend selection into the infrastructure so that it's transparent from your application's perspective, e.g. by using a session pooling system (I am not a MySQL user, but in the PostgreSQL world we have pgpool).
--- edit ---
I realised you probably want your database session and connection handled by Celery itself, so the above very likely does not answer your question directly. In my simple project I initialise the database connection within the tasks that require it, since in my particular case most tasks do not need the database at all.
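As an illustration of that per-task approach, a minimal sketch; the task, engine URL and table are made up for the example:

from celery import Celery
import sqlalchemy

app = Celery('tasks', broker='redis://localhost:6379/0')

@app.task
def record_result(value):
    # Open the connection only inside the task that actually needs it.
    engine = sqlalchemy.create_engine('mysql://scott:tiger@localhost/foo')
    with engine.begin() as conn:  # commits on success, rolls back on error
        conn.execute(
            sqlalchemy.text('INSERT INTO results (value) VALUES (:v)'),
            {'v': value},
        )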
I have a simple integration test in Django that spawns a single Celery worker to run a job, which writes a record to the database. The Django thread also writes a record to the database. Because it's a test, I use the default in-memory sqlite3 database. There are no transactions being used.
I often get this error:
django.db.utils.OperationalError: database table is locked
which according to the Django docs is due to one connection timing out while waiting for another to finish. It's "more concurrency than SQLite can handle in default configuration". This seems strange given that it's two records in two threads. Nevertheless, the same docs say to increase the timeout option to force connections to wait longer. OK, I changed my database settings to this:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
        'OPTIONS': {'timeout': 10000000},
    }
}
This has no effect. The error still appears and it clearly has not waited 1e7 seconds or 1e7 milliseconds or 1e7 microseconds before doing so. Is there an additional setting I'm missing?
I have tried both Python 3.5 and Python 3.6 and both Django 1.11 and Django 2.0.
I had the same issue and my experiments gave me the following:
I figured out that Django uses an in-memory SQLite DB in test mode until you explicitly change this. That explains why I only saw the problem in my unit tests. To force Django to use a file-based SQLite DB, set DATABASES->TEST->NAME explicitly in your settings.py, for example like this:
DATABASES = {
    'default': {
        ...
        'TEST': {
            'NAME': 'testdb.sqlite3',
        },
    },
}
Setting a timeout value larger than 2147483.647 (looks familiar, right? :-)) disables the timeout (or sets it to a negligibly small value).
As far as I understand, the root of the problem is that when SQLite uses a shared cache, the timeout value is not respected at all.
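Putting those two observations together, a combined settings sketch might look like this (the file name testdb.sqlite3 and the 20-second timeout are just example values):

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
        'OPTIONS': {'timeout': 20},  # seconds; per the note above, not respected by the in-memory/shared-cache test DB
        'TEST': {
            'NAME': os.path.join(BASE_DIR, 'testdb.sqlite3'),
        },
    }
}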
My Django unit tests take a long time to run, so I'm looking for ways to speed that up. I'm considering installing an SSD, but I know that has its downsides too. Of course, there are things I could do with my code, but I'm looking for a structural fix. Even running a single test is slow since the database needs to be rebuilt / south migrated every time. So here's my idea...
Since I know the test database will always be quite small, why can't I just configure the system to always keep the entire test database in RAM? Never touch the disk at all. How do I configure this in Django? I'd prefer to keep using MySQL since that's what I use in production, but if SQLite 3 or something else makes this easy, I'd go that way.
Does SQLite or MySQL have an option to run entirely in memory? It should be possible to configure a RAM disk and then configure the test database to store its data there, but I'm not sure how to tell Django / MySQL to use a different data directory for a certain database, especially since it keeps getting erased and recreated each run. (I'm on a Mac FWIW.)
If you set your database engine to sqlite3 when you run your tests, Django will use an in-memory database.
I'm using code like this in my settings.py to set the engine to sqlite when running my tests:
if 'test' in sys.argv:
    DATABASE_ENGINE = 'sqlite3'
Or in Django 1.2:
if 'test' in sys.argv:
    DATABASES['default'] = {'ENGINE': 'sqlite3'}
And finally in Django 1.3 and 1.4:
if 'test' in sys.argv:
    DATABASES['default'] = {'ENGINE': 'django.db.backends.sqlite3'}
(The full path to the backend isn't strictly necessary with Django 1.3, but makes the setting forward compatible.)
You can also add the following line, in case you are having problems with South migrations:
SOUTH_TESTS_MIGRATE = False
I usually create a separate settings file for tests and use it with the test command, e.g.
python manage.py test --settings=mysite.test_settings myapp
It has two benefits:
You don't have to check for test or any such magic word in sys.argv; test_settings.py can simply be:
from settings import *
# make tests faster
SOUTH_TESTS_MIGRATE = False
DATABASES['default'] = {'ENGINE': 'django.db.backends.sqlite3'}
Or you can further tweak it for your needs, cleanly separating test settings from production settings.
Another benefit is that you can run tests with the production database engine instead of sqlite3, avoiding subtle bugs. So while developing, use
python manage.py test --settings=mysite.test_settings myapp
and before committing code run once
python manage.py test myapp
just to be sure that all tests are really passing.
MySQL supports a storage engine called "MEMORY", which you can configure in your database config (settings.py) as such:
'USER': 'root',      # Not used with sqlite3.
'PASSWORD': '',      # Not used with sqlite3.
'OPTIONS': {
    "init_command": "SET storage_engine=MEMORY",
}
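For context, that fragment sits inside the usual 'default' entry; a fuller sketch might look like this (the database name and credentials are placeholders):

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'test_project',
        'USER': 'root',          # Not used with sqlite3.
        'PASSWORD': '',          # Not used with sqlite3.
        'HOST': 'localhost',
        'PORT': '',
        'OPTIONS': {
            "init_command": "SET storage_engine=MEMORY",
        },
    }
}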
Note that the MEMORY storage engine doesn't support blob / text columns, so if you're using django.db.models.TextField this won't work for you.
I can't answer your main question, but there are a couple of things that you can do to speed things up.
Firstly, make sure that your MySQL database is set up to use InnoDB. Then it can use transactions to roll back the state of the DB before each test, which in my experience has led to a massive speed-up. You can pass a database init command in your settings.py (Django 1.2 syntax):
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'HOST': 'localhost',
        'NAME': 'mydb',
        'USER': 'whoever',
        'PASSWORD': 'whatever',
        'OPTIONS': {"init_command": "SET storage_engine=INNODB"},
    }
}
Secondly, you don't need to run the South migrations each time. Set SOUTH_TESTS_MIGRATE = False in your settings.py and the database will be created with plain syncdb, which will be much quicker than running through all the historic migrations.
You can apply two tweaks:
use transactional tables: the initial fixture state will be restored using a database rollback after every TestCase.
put your database data dir on a ramdisk: you will gain a lot as far as database creation is concerned, and running tests will also be faster.
I'm using both tricks and I'm quite happy.
How to set it up for MySQL on Ubuntu:
$ sudo service mysql stop
$ sudo cp -pRL /var/lib/mysql /dev/shm/mysql
$ vim /etc/mysql/my.cnf
# in my.cnf, point the data directory at the ramdisk: datadir = /dev/shm/mysql
$ sudo service mysql start
Beware: this is just for testing; after a reboot the in-memory database is lost!
Another approach: have another instance of MySQL running on a tmpfs RAM disk. Instructions in this blog post: Speeding up MySQL for testing in Django.
Advantages:
You use exactly the same database that your production server uses
No need to change your default MySQL configuration
Extending Anurag's answer, I simplified the process by creating the same test_settings file and adding the following to manage.py:
if len(sys.argv) > 1 and sys.argv[1] == "test":
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.test_settings")
else:
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")
This seems cleaner since sys is already imported in manage.py, and manage.py is only used via the command line, so there is no need to clutter up the settings.
Use the following in your settings.py:
DATABASES['default']['ENGINE'] = 'django.db.backends.sqlite3'
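As written, that line switches the engine unconditionally; wrapping it in the sys.argv check used in the answers above keeps normal runs on your real backend (a sketch, assuming settings.py imports sys):

import sys

if 'test' in sys.argv:
    DATABASES['default']['ENGINE'] = 'django.db.backends.sqlite3'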