Test database accessibility in Django - python

My question is very similar to this question
I'm just getting started with Django, and I find myself attempting to learn how it works any time I have a spare moment and my laptop available. I've found that Heroku is a pretty great place to test things, but I can't always reach the internet if I'm waiting to pick up kids, or something similar. In development, I would like to create a test that will check if a DB is accessible. If not, fail over to an SQLite DB.
I started with code heavily borrowed from here:
def pingable(hostname):
try:
return os.system("ping -c 1 " + hostname + " > /dev/null 2>&1") == 0
except:
return False
if (not pingable(DATABASES['default']['HOST'])):
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.sqlite3',
'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
}
}
I simply plop that code immediately after the DATABASES variable is set. But this has a few weaknesses. The most glaring is that AWS (Which Heroku uses) doesn't respond to pings unless you specifically enable them . . . and honestly, why make things less secure if you don't have to?
So in the interest of not reinventing the wheel, this has led me to ask this question: has someone created a way to check if a Django DB is accessible?
I really only need to check Postgres . . . but I'd really love to find a generic solution, so half credit if you can point me to a solution that only works for Postgres.
Edit: To clarify, the internet itself may be available, but the necessary port(s) may be blocked by a firewall . . . it's hard to know what will be available

This isn't really the way to manage database settings between Heroku and your local dev machine.
Heroku manages all these sorts of settings via environment variables, which is one of the principles of the 12-factor app. They've also made a Django library, dj-database-url, which reads those env vars and automatically configures the settings appropriately.
You should use this for your database settings, and then you can set a local env var DATABASE_URL with the address of your local sqlite3 database. Then your app will automatically run in both dev and production and configure itself to point to the relevant database automatically.

Related

Database Management on Django and Github

I am trying to set up a website using the Django Framework. Because of it's convenience, I had choosen SQLite as my database since the start of my project. It's very easy to use and I was very happy with this solution.
Being a new developer, I am quite new to Github and database management. Since SQLite databases are located in a single file, I was able to push my updates on Github until that .db file reached a critical size larger than 100MB. Since then, it seems my file is too large to push on my repository (for others having the same problem I found satisfying answers here: GIT: Unable to delete file from repo).
Because of this problem, I am now considering an alternative solution:
Since my website will require users too interact with my database (they are expected post a certain amount data), I am thinking about switching SQLite for MySQL. I was told MySQL will handle better the user inputs and will scale more easily (I dare to expect a large volume of users). This is the first part of my question. Is switching to MySQL after having used SQLite for a while a good idea/good practice or will it lead to migration problems?
If the answer to that first question is yes, then I have other questions about how to handle this change. Since SQLite is serverless, I will have to set up a new server for MySQL. Will I be able to access my data remotely with that server? Since I used to push my database on my Github repository, this is where I use to get my data from when I wanted to work remotely. Will there be a way for me to host my data on a server (hopefully for free) and fetch it the same way I fetch my code on Github?
Thank you very much for your help and I hope you have a nice day.
First of all, you shouldn't be uploading any sensitive data to your repository. That includes database passwords, Django's secret key or the database itself in the case of SQLite.
Answering your first question, there shouldn't be any problem switching from SQLite to MySQL. Django handles migrations exceptionally and SQLite has less features than MySQL. To migrate your data to a mysql database you can use django's dumpdata and loaddata.
Now, your second question is a bit more complicated. You can always expose your database to the Internet, but that is usually not a good idea unless you know exactly what you're doing and know how to secure it properly. If you go this way though, you can just change the database parameters in your settings file to point to your MySQL database's public IP and add the db name, user and password.
My recommendation though is to have one database for development in your dev PC and another in your production server that is behind a firewall and can only be accessed through localhost. I don't think you need the db in your dev pc to be always up to date, if you have some sample data that should be enough.
So, instead of writing sensitive data into the settings file you can have a secrets.json file in the root of your project that looks like this:
{
"secret_key": "YOURSUPERSECRETKEY",
"debug": true, TRUE IN YOUR DEV PC, FALSE IN YOUR PROD SERVER
"allowed_hosts": ["127.0.0.1" , "localhost", "YOUR"],
"db_name": "YOURDBNAME",
"db_user": "YOURDBUSER",
"db_password": "YOURDBPASSWORD",
"db_host": "localhost",
"db_port": 3306
}
This file should be included in your .gitignore so it doesn't get pushed to your repository and you would have one in your local pc and another one with different settings in your production server (you can use vi or nano to create the file).
Then in your settings.py file you can do the following:
import json
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
try:
with open(os.path.join(BASE_DIR, 'secrets.json')) as handle:
SECRETS = json.load(handle)
except IOError:
SECRETS = {}
SECRET_KEY = SECRETS['secret_key']
ALLOWED_HOSTS = SECRETS['allowed_hosts']
DEBUG = SECRETS['debug']
...
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.mysql',
'NAME': SECRETS['db_name'],
'USER': SECRETS['db_user'],
'PASSWORD': SECRETS['db_password'],
'HOST': SECRETS['db_host'],
'PORT': SECRETS['db_port'],
}
}

How do I upgrade an old Django Google App Engine application to a Second Generation Cloud SQL Instance?

I have a Django application that I wrote 5 years ago, which has been running successfully on Google App Engine - until last month when Google upgraded to Second Generation Cloud SQL.
Currently, I have a settings.py file, which includes a database definition which looks like this:
DATABASES = {
'default': {
'ENGINE': 'google.appengine.ext.django.backends.rdbms',
'INSTANCE': 'my-project-name:my-db-instance',
'NAME': 'my-db-name',
},
Google's upgrade guide, tells me that the connection name needs to change from 'my-project-name:my-db-instance' to 'my-project-name:my-region:my-db-instance'. That's simple enough. Changing this leads me to get the error
InternalError at /login/
(0, u'Not authorized to access instance:
my-project-name:my-region:my-db-instance')
According to this question, I need to add the prefix '/cloudsql/' to my instance name. So, I changed this (and the ENGINE specification) to give:
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.mysql',
'INSTANCE': '/cloudsql/my-project-name:my-region:my-db-instance',
'NAME': 'my-db-name',
'USER' : 'root',
'PASSWORD': '******************',
},
I uploaded the modified file to Google (using gcloud app deploy). This time I get a completely different error screen, showing:
Error: Server Error
The server encountered an error and could not complete your request.
Please try again in 30 seconds.
When I look in the Logs, I see:
ImproperlyConfigured: Error loading MySQLdb module: No module named
MySQLdb
It was pointed out by Daniel Ocando in this question, "The rdbms library will not work with an upgraded Second Generation Cloud SQL instance". Connections to the database now need to be made by means of a Unix domain socket.
Google provide documentation and examples of how to do this. However, I dont find Google's instructions very helpful. For connecting a Python application, they give this example code:
main.py:
db = sqlalchemy.create_engine(
sqlalchemy.engine.url.URL(
drivername="mysql+pymysql",
username=db_user,
password=db_pass,
database=db_name,
query={"unix_socket": "/cloudsql/{}".format(cloud_sql_connection_name)},
),
)
My application doesnt have a main.py file, so I'm really not sure where to put this code. I have looked at the full example code for this in GitHub, but I am none the wiser about what changes I need to make in my settings.py file, or elsewhere in my application.
My question is: Do I really have to go down this route (shifting to use SQLalchemy library), or can I upgrade my Django app to work with a second generation Google cloud SQL instance simply by making some changes in my settings.py file? And, if so, what changes?
My application uses Django 1.4, Python 2.7, and I'm not using the Flask framework which Google suggest.
I learned to use Django purely for the purposes of writing this application 5 years ago, but I have not used it since - so I have forgotten pretty much everything I knew about Django and Python.
A few notes here before I start: Django 1.4 and Python 2.7 are no longer supported, so this configuration may or may not work.
You were on the right track up until the part where you introduced the SQLAlchemy instructions. These aren't Django configurations. You should be able to keep your current DATABASES syntax, provided you complete the following steps:
Ensure you add the settings to your app.yaml to allow your new database.
Ensure you have set the Cloud SQL Client (preferred) permissions to the App Engine service account.
Make sure you have mysqlclient as a Python dependency.
The Django on App Engine Flex guide uses PostgreSQL, but does include some MySQL specific suggestions that should be useful. There's also sample DATABASE configurations there.
Hope this helps!
Many thanks to glasnt for the helpful suggestions. These put me on the right track, and together with some information I found elsewhere, I got my site back up and running again - much to the delight of my client!
Here are the details of what I had to change:
I updated my app.yaml file and added:
beta_settings:
cloud_sql_instance: "my-project-name:my-region:my-db-instance"
and also:
libraries:
- name: MySQLdb
version: "latest"
I updated settings.py, and added:
import MySQLdb
and I updated the DATABASES definition to:
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.mysql',
'HOST': '/cloudsql/my-project-name:my-region:my-db-instance',
'NAME': 'my-db-name',
'USER' : 'root',
'PASSWORD': 'my-db-password',
},
I found that I did not need to set up any permissions for the App Engine Service account. Google's upgrade guide, states:
When you upgrade an instance with an authorized App Engine project
from First Generation to Second Generation, Cloud SQL creates a
special service account that provides the same access as the
authorized App Engine project did before the upgrade. Because this
service account authorizes access only to a specific instance, rather
than the entire project, this service account is not visible in the
IAM service account page, and you cannot update or delete it.
So, for a migrated first-gen application, no permissions need to be configured.

Python App Engine debug/dev mode

I'm working on an App Engine project (Python) where we'd like to make certain changes to the app's behavior when debugging/developing (most often locally). For example, when debugging, we'd like to disable our rate-limiting decorators, turn on the debug param in the WSGIApplication, maybe add some asserts.
As far as I can tell, App Engine doesn't naturally have any concept of a global dev-mode or debug-mode, so I'm wondering how best to implement such a mode. The options I've been able to come up with so far:
Use google.appengine.api.app_identity.get_default_version_hostname() to get the hostname and check if it begins with localhost. This seems... unreliable, and doesn't allow for using the debug mode in a deployed app instance.
Use os.environ.get('APPLICATION_ID') to get the application id, which according to this page is automatically prepended with dev~ by the development server. Worryingly, the very source of this information is in a box warning:
Do not get the App ID from the environment variable. The development
server simulates the production App Engine service. One way in which
it does this is to prepend a string (dev~) to the APPLICATION_ID
environment variable, which is similar to the string prepended in
production for applications using the High Replication Datastore. You
can modify this behavior with the --default_partition flag, choosing a
value of "" to match the master-slave option in production. Google
recommends always getting the application ID using get_application_id,
as described above.
Not sure if this is an acceptable use of the environment variable. Either way it's probably equally hacky, and suffers the same problem of only working with a locally running instance.
Use a custom app-id for development (locally and deployed), use the -A flag in dev_appserver.py, and use google.appengine.api.app_identity.get_application_id() in the code. I don't like this for a number of reasons (namely having to have two separate app engine projects).
Use a dev app engine version for development and detect with os.environ.get('CURRENT_VERSION_ID').split('.')[0] in code. When deployed this is easy, but I'm not sure how to make dev_appserver.py use a custom version without modifying app.yaml. I suppose I could sed app.yaml to a temp file in /tmp/ with the version replaced and the relative paths resolved (or just create a persistent dev-app.yaml), then pass that into dev_appserver.py. But that seems also kinda dirty and prone to error/sync issues.
Am I missing any other approaches? Any considerations I failed to acknowledge? Any other advice?
In regards to "detecting" localhost development we use the following in our applications settings / config file.
IS_DEV_APPSERVER = 'development' in os.environ.get('SERVER_SOFTWARE', '').lower()
That used in conjunction with the debug flag should do the trick for you.

How to run Django's test database only in memory?

My Django unit tests take a long time to run, so I'm looking for ways to speed that up. I'm considering installing an SSD, but I know that has its downsides too. Of course, there are things I could do with my code, but I'm looking for a structural fix. Even running a single test is slow since the database needs to be rebuilt / south migrated every time. So here's my idea...
Since I know the test database will always be quite small, why can't I just configure the system to always keep the entire test database in RAM? Never touch the disk at all. How do I configure this in Django? I'd prefer to keep using MySQL since that's what I use in production, but if SQLite 3 or something else makes this easy, I'd go that way.
Does SQLite or MySQL have an option to run entirely in memory? It should be possible to configure a RAM disk and then configure the test database to store its data there, but I'm not sure how to tell Django / MySQL to use a different data directory for a certain database, especially since it keeps getting erased and recreated each run. (I'm on a Mac FWIW.)
If you set your database engine to sqlite3 when you run your tests, Django will use a in-memory database.
I'm using code like this in my settings.py to set the engine to sqlite when running my tests:
if 'test' in sys.argv:
DATABASE_ENGINE = 'sqlite3'
Or in Django 1.2:
if 'test' in sys.argv:
DATABASES['default'] = {'ENGINE': 'sqlite3'}
And finally in Django 1.3 and 1.4:
if 'test' in sys.argv:
DATABASES['default'] = {'ENGINE': 'django.db.backends.sqlite3'}
(The full path to the backend isn't strictly necessary with Django 1.3, but makes the setting forward compatible.)
You can also add the following line, in case you are having problems with South migrations:
SOUTH_TESTS_MIGRATE = False
I usually create a separate settings file for tests and use it in test command e.g.
python manage.py test --settings=mysite.test_settings myapp
It has two benefits:
You don't have to check for test or any such magic word in sys.argv, test_settings.py can simply be
from settings import *
# make tests faster
SOUTH_TESTS_MIGRATE = False
DATABASES['default'] = {'ENGINE': 'django.db.backends.sqlite3'}
Or you can further tweak it for your needs, cleanly separating test settings from production settings.
Another benefit is that you can run test with production database engine instead of sqlite3 avoiding subtle bugs, so while developing use
python manage.py test --settings=mysite.test_settings myapp
and before committing code run once
python manage.py test myapp
just to be sure that all test are really passing.
MySQL supports a storage engine called "MEMORY", which you can configure in your database config (settings.py) as such:
'USER': 'root', # Not used with sqlite3.
'PASSWORD': '', # Not used with sqlite3.
'OPTIONS': {
"init_command": "SET storage_engine=MEMORY",
}
Note that the MEMORY storage engine doesn't support blob / text columns, so if you're using django.db.models.TextField this won't work for you.
I can't answer your main question, but there are a couple of things that you can do to speed things up.
Firstly, make sure that your MySQL database is set up to use InnoDB. Then it can use transactions to rollback the state of the db before each test, which in my experience has led to a massive speed-up. You can pass a database init command in your settings.py (Django 1.2 syntax):
DATABASES = {
'default': {
'ENGINE':'django.db.backends.mysql',
'HOST':'localhost',
'NAME':'mydb',
'USER':'whoever',
'PASSWORD':'whatever',
'OPTIONS':{"init_command": "SET storage_engine=INNODB" }
}
}
Secondly, you don't need to run the South migrations each time. Set SOUTH_TESTS_MIGRATE = False in your settings.py and the database will be created with plain syncdb, which will be much quicker than running through all the historic migrations.
You can do double tweaking:
use transactional tables: initial fixtures state will be set using database rollback after every TestCase.
put your database data dir on ramdisk: you will gain much as far as database creation is concerned and also running test will be faster.
I'm using both tricks and I'm quite happy.
How to set up it for MySQL on Ubuntu:
$ sudo service mysql stop
$ sudo cp -pRL /var/lib/mysql /dev/shm/mysql
$ vim /etc/mysql/my.cnf
# datadir = /dev/shm/mysql
$ sudo service mysql start
Beware, it's just for testing, after reboot your database from memory is lost!
Another approach: have another instance of MySQL running in a tempfs that uses a RAM Disk. Instructions in this blog post: Speeding up MySQL for testing in Django.
Advantages:
You use the exactly same database that your production server uses
no need to change your default mysql configuration
Extending on Anurag's answer I simplified the process by creating the same test_settings and adding the following to manage.py
if len(sys.argv) > 1 and sys.argv[1] == "test":
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.test_settings")
else:
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")
seems cleaner since sys is already imported and manage.py is only used via command line, so no need to clutter up settings
Use below in your setting.py
DATABASES['default']['ENGINE'] = 'django.db.backends.sqlite3'

In Python, how can I test if I'm in Google App Engine SDK?

Whilst developing I want to handle some things slight differently than I will when I eventually upload to the Google servers.
Is there a quick test that I can do to find out if I'm in the SDK or live?
See: https://cloud.google.com/appengine/docs/python/how-requests-are-handled#Python_The_environment
The following environment variables are part of the CGI standard, with special behavior in App Engine:
SERVER_SOFTWARE:
In the development web server, this value is "Development/X.Y" where "X.Y" is the version of the runtime.
When running on App Engine, this value is "Google App Engine/X.Y.Z".
In app.yaml, you can add IS_APP_ENGINE environment variable
env_variables:
IS_APP_ENGINE: 1
and in your Python code check if it has been set
if os.environ.get("IS_APP_ENGINE"):
print("The app is being run in App Engine")
else:
print("The app is being run locally")
Based on the same trick, I use this function in my code:
def isLocal():
return os.environ["SERVER_NAME"] in ("localhost", "www.lexample.com")
I have customized my /etc/hosts file in order to be able to access the local version by prepending a "l" to my domain name, that way it is really easy to pass from local to production.
Example:
production url is www.example.com
development url is www.lexample.com
I just check the httplib (which is a wrapper around appengine fetch)
def _is_gae():
import httplib
return 'appengine' in str(httplib.HTTP)
A more general solution
A more general solution, which does not imply to be on a Google server, detects if the code is running on your local machine.
I am using the code below regardless the hosting server:
import socket
if socket.gethostname() == "your local computer name":
DEBUG = True
ALLOWED_HOSTS = ["127.0.0.1", "localhost", ]
...
else:
DEBUG = False
ALLOWED_HOSTS = [".your_site.com",]
...
If you use macOS you could write a more generic code:
if socket.gethostname().endswith(".local"): # True in your local computer
...
Django developers must put this sample code in the file settings.py of the project.
EDIT:
According to Jeff O'Neill in macOS High Sierra socket.gethostname() returns a string ending in ".lan".
The current suggestion from Google Cloud documentation is:
if os.getenv('GAE_ENV', '').startswith('standard'):
# Production in the standard environment
else:
# Local execution.
Update on October 2020:
I tried using os.environ["SERVER_SOFTWARE"] and os.environ["APPENGINE_RUNTIME"] but both didn't work so I just logged all keys from the results from os.environ.
In these keys, there was GAE_RUNTIME which I used to check if I was in the local environment or cloud environment.
The exact key might change or you could add your own in app.yaml but the point is, log os.environ, perhaps by adding to a list in a test webpage, and use its results to check your environment.

Categories

Resources